Abstract 

We consider the problem of estimating a signal corrupted by independent interference with the assistance of a 
cost-constrained helper who knows the interference causally or noncausally. When the interference is known causally, 
we characterize the minimum distortion incurred in estimating the desired signal. In the noncausal case, we present a 
general achievable scheme for discrete memoryless systems and novel lower bounds on the distortion for the binary 
and Gaussian settings. Our Gaussian setting coincides with that of assisted interference suppression introduced by 
Grover and Sahai. Our lower bound for this setting is based on the relation recently established by Verdú between 
divergence and minimum mean squared error. We illustrate with a few examples that this lower bound can improve 
on those previously developed. Our bounds also allow us to characterize the optimal distortion in several interesting 
regimes. Moreover, we show that causal and noncausal estimation are not equivalent for this problem. Finally, we 
consider the case where the desired signal is also available at the helper. We develop new lower bounds for this setting 
that improve on those previously developed, and characterize the optimal distortion up to a constant multiplicative 
factor for some regimes of interest. 






I. Introduction 

Consider a joint source-channel coding problem as depicted in Figure 1. We have two memoryless sources, 
$S_1$ (the desired signal) and $S_2$ (the interfering signal). The decoder's aim is to estimate the source sequence $S_1^n$ 
from $Y^n$, with the goal of minimizing the average per-symbol distortion $E\left(\sum_{i=1}^n d(\hat{S}_{1i}(Y^n), S_{1i})\right)/n$. The encoder 
(helper), who knows the interfering signal $S_2$, aids the decoder in reconstructing the signal $S_1$ through its choice 
of $X$, subject to a cost constraint $\rho(X)$. 
Fig. 1: Estimation with a helper who knows the interference. The interfering signal is $S_2^n$ while the desired signal is 
$S_1^n$. The encoder (helper) tries to help the decoder in estimating $S_1^n$ by reducing the interference due to $S_2^n$, subject 
to a per-symbol cost constraint on its transmission $X^n$. 

Applications may arise in sensor networks or cognitive radio systems. As a motivating example, suppose Alice is 
talking to Bob in his office. As a result of ongoing construction work near Bob's office, there is high interference 
which makes it hard for Bob to listen to Alice. Fortunately, Bob recently purchased a noise cancellation device 
which has a microphone placed near the construction site. The microphone measures the interfering signal from 
the construction site and transmits it to a noise cancellation speaker situated in Bob's office. Since electromagnetic 
waves travel faster than sound, the noise cancellation speaker knows the interfering signal noncausally. Due to a 
power constraint on the speaker, it cannot cancel the interference fully. What then, is the minimum distortion that 
can be achieved by Bob in trying to reconstruct Alice's speech? 

Our setup is closely related to several strands of work involving communication over channels with states. In 
[1], the authors considered the problem of State Amplification, where a message is to be sent to the decoder and 
the decoder also forms a list of possible $S_2^n$ sequences. The goal is to maximize the message transmission rate and 
reduce the uncertainty the decoder has regarding $S_2^n$, i.e., reduce the list size of possible $S_2^n$ sequences. Recently, 
the problem of state amplification with a distortion constraint was considered in [2], with an additional condition 
that the encoder only knows the state $S_2$ causally. This setting is similar to our setting, with the main difference 
being that the decoder wishes to reconstruct $S_2$ rather than $S_1$. When our setting is specialized to the Gaussian case 
with the mean squared error distortion between the reconstruction and the signal, our setting becomes equivalent 
to the problem of Assisted Interference Suppression considered in [3]. As detailed in [3], this problem is closely 
related to Witsenhausen's counterexample in Optimum Control Theory [4]. 

In this paper, we consider both the case when $S_2$ is available causally at the encoder, and the case when $S_2$ is 
available noncausally at the encoder. Our main contributions are as follows: 

1) When $S_2$ is available causally at the encoder, we characterize the minimum achievable distortion in $S_1$. We 
borrow certain ideas used in the characterization of the distortion-cost region for the causal state amplification 
problem in [2] to establish our result. 

2) For the noncausal setting, we first give an achievable scheme for the general discrete memoryless system 
and then focus our attention on the case where $S_1$ and $S_2$ are independent Bernoulli random variables and 
the distortion measure is Hamming. We give two lower bounds on the achievable distortion for this binary 
setting. The first lower bound is based on ideas from the Assisted Interference Suppression problem [3], while 
the second lower bound is based on ideas from the problem of Compression with Actions [5]. Neither bound 
contains the other, and either bound can be the better one depending on the regime of interest. Using our 
lower and upper bounds, we characterize the minimum achievable distortion in several regimes. In particular, 
we provide an example to show that causal and noncausal estimation of $S_1$ are not equivalent and that causal 
knowledge of $S_2$ can incur a higher distortion than noncausal knowledge of $S_2$ at the encoder. A complete 
characterization of the minimum achievable distortion in the noncausal case remains open. 

3) In the Gaussian case, where $S_1$ and $S_2$ are independent Gaussian random variables with finite variance, the 
distortion measure is the mean square error and $Y = X + S_1 + S_2$, we note that our setting coincides 
with that of Assisted Interference Suppression [3]. For this setting, we give a lower bound on the minimum 
achievable distortion which in some places improves on that given in [3], and also its improved version given 
in [6]. The proof of our lower bound relies on an application of Verdú's relation between relative entropy 
and mismatched estimation in Gaussian noise [7]. In recent years, since the seminal paper [8] established 
the relationship between minimum mean square error (MMSE) estimation in Gaussian noise and the mutual 
information between the signal and the output, there has been interest in applying these information-estimation 
relations to problems in Information Theory (see, e.g., [9] and [10]). Our lower bound, which seems difficult 
to obtain by traditional techniques such as the Entropy Power Inequality [11, Chapter 2], provides another 
application of these information-estimation relations. 

4) In the Gaussian case, we also consider the setting when the encoder has access to $S_1$ noncausally, in addition 
to $S_2$. This setting is a special case of a problem considered in [12]. We give a lower bound for this setting 
that contains the previous bounds in [12] and can be strictly better in some cases. Furthermore, we establish 
constant gap results between the achievable distortion and our lower bound. 

We first provide the formal definitions in the next section. In Section III, we consider the causal case. In 
Section IV, we consider the noncausal case, present an achievable scheme for general discrete memoryless systems 
and analyze the binary setting in detail. Section V deals with the Gaussian version of this problem, while we 
consider the Gaussian setting when $S_1$ is also available noncausally at the encoder in Section VI. We conclude in 
Section VII with a summary of our findings and directions for future work. 

II. Definitions 
In this section, we give formal definitions for our problem settings. We will follow the notation of [11] 
and assume throughout this paper that the channel in consideration is memoryless. That is, $p(y^n|x^n, s_1^n, s_2^n) = 
\prod_{i=1}^n p(y_i|x_i, s_{1i}, s_{2i})$. We also assume that $S_1^n$ and $S_2^n$ are independent i.i.d. sequences. 

A. Estimation with interference known at the helper 

An $(n, C)$ code for the setting shown in Figure 1 when the interference is known noncausally consists of 

• An encoder that maps the interference $S_2^n$ to $X^n$, $f : \mathcal{S}_2^n \to \mathcal{X}^n$; 

• A decoder that maps the output $Y^n$ to the reconstruction sequence $\hat{S}_1^n$, $g : \mathcal{Y}^n \to \hat{\mathcal{S}}_1^n$; 

such that $E \sum_{i=1}^n \rho(X_i)/n \le C$. The expected per-symbol distortion, $D$, is given by $D = E\, d(S_1^n, \hat{S}_1^n) = 
E \sum_{i=1}^n d(S_{1i}, \hat{S}_{1i})/n$. 

A distortion $D$ is said to be achievable under the cost constraint $C$ if there exists a sequence of $(n, C + \epsilon_n)$ 
codes, where $\epsilon_n \to 0$ as $n \to \infty$, and 

$$\limsup_{n \to \infty} E\, d(S_1^n, \hat{S}_1^n) \le D.$$

The minimum achievable distortion, $D(C)_{\min}$, is then defined as the infimum of the set of achievable distortions 
under the cost constraint $C$. 

When the interference is only known causally, the definitions are mostly the same, with the difference being that 
the encoder is restricted to causal mappings: 

$$f_i : \mathcal{S}_2^i \to \mathcal{X} \quad \text{for } i \in [1 : n].$$

B. Estimation with source and interference known at the helper 

This setting is shown in Figure 2. For this setting, we restrict attention to the case where $S_1$ and $S_2$ are 
independent Gaussian random variables, $S_1 \sim \mathcal{N}(0, P_1)$ and $S_2 \sim \mathcal{N}(0, P_2)$. Furthermore, we assume that both 
$S_1$ and $S_2$ are known noncausally at the encoder, and the distortion measure is the mean square error between $S_1$ 
and its reconstruction. That is, $d(s_1, \hat{s}_1) = (s_1 - \hat{s}_1)^2$. The channel is specified by $Y = X + S_1 + S_2 + Z$, where 
$Z \sim \mathcal{N}(0, N)$ is independent of $S_1$ and $S_2$. The cost constraint is the expected power constraint: $E(\sum_{i=1}^n X_i^2/n)$. 
As the definitions are similar to the previous setting, we only mention the difference. That is, the encoder now 
maps both $S_1^n$ and $S_2^n$ to $X^n$: 

$$f : \mathcal{S}_1^n \times \mathcal{S}_2^n \to \mathcal{X}^n.$$



III. Causal Estimation with a helper 

In this section, we give the distortion-cost tradeoff region for the setting given in Section II-A under the condition that 
the interfering signal, $S_2$, is causally known at the encoder. We will discuss some connections between our setting 
and that of the problem of Causal State Amplification discussed in [2]. 

Theorem 1. The distortion-cost region for the problem of estimation with a helper when the interfering signal is 
causally known at the encoder is given by 

$$D(C)_{\min} = \min_{U, V, X, \hat{S}_1} E\, d(S_1, \hat{S}_1(U, V, Y))$$

for some $p(u)p(v|u, s_2)p(s_1)p(s_2)$ and functions $x(u, s_2)$ and $\hat{s}_1(u, v, y)$ such that 

$$I(U; Y) \ge I(V; S_2 | U, Y),$$
$$E \rho(X) \le C.$$

The cardinalities of the auxiliary random variables may be upper bounded by $|\mathcal{U}| \le |\mathcal{S}_2|(|\mathcal{X}| - 1) + 2$ and 
$|\mathcal{V}| \le |\mathcal{U}|(|\mathcal{S}_2| + 1)$. 

Fig. 2: Gaussian estimation with a helper that knows both the interference and the source. The random variables 
$S_1$, $S_2$ and $Z$ are independent zero mean Gaussian random variables. The encoder has knowledge of $S_1^n$ and $S_2^n$ 
noncausally and the decoder tries to perform lossy reconstruction of $S_1^n$. The distortion criterion is the mean square 
error criterion and the cost constraint is the expected power constraint on the encoder output, $X$. 



The achievability scheme in this Theorem is actually the same as that used in the problem of Causal State 
Amplification considered in [2], where the focus was on reconstructing $S_2$ instead of $S_1$. The expressions for the 
distortion-cost tradeoff are also similar, with the difference being that in the Causal State Amplification setting, 
one is interested in minimizing the distortion between $S_2$ and its reconstruction, rather than between $S_1$ and its 
reconstruction. Of course, the optimizing choices of auxiliary random variables in the two problems are different, 
since in our setting, we try to minimize the interference ($S_2$) as much as possible subject to a cost constraint, 
whereas in the setting of Causal State Amplification, one tries to amplify the interfering signal. As a (trivial) 
example, consider the case when $S_1, S_2, X \in \{0, 1\}$ and $Y = X \oplus S_1 \oplus S_2$ with no cost constraint. Then, clearly, in 
our problem of causal estimation with a helper, we set $X = S_2$ to cancel out the interference completely, thereby 
recovering $S_1$ losslessly. In contrast, for the problem of Causal State Amplification, we will not cancel out $S_2$, 
since that is the signal we are trying to recover. 
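
The trivial example above can be checked by direct enumeration. The following is a minimal sketch; the function name `channel` is an illustrative choice and does not appear in the paper.

```python
from itertools import product

def channel(x, s1, s2):
    """Binary additive channel Y = X xor S1 xor S2 (illustrative helper name)."""
    return x ^ s1 ^ s2

# With no cost constraint the helper may transmit X = S2 for every symbol.
for s1, s2 in product((0, 1), repeat=2):
    x = s2                      # helper cancels the interference
    y = channel(x, s1, s2)
    assert y == s1              # decoder recovers S1 by outputting Y
print("Y equals S1 for every (s1, s2) pair")
```

Because the strategy uses only the current symbol of $S_2$, it is in particular a causal strategy.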

Theorem 1 gives the optimal cost-distortion tradeoff for the estimation problem when the encoder knows the 
interfering signal causally. A natural question to ask is whether there is any penalty incurred by this restriction. In 
the next section, we will give an example of a binary estimation with a helper problem under Hamming loss and 
show that there is indeed a penalty incurred in only knowing the interfering signal causally. 

Proof of Theorem 1: 

Sketch of Achievability: As the achievability scheme is similar to that in [2], we give only a sketch in Appendix I 
for completeness. 

Converse: Given an $(n, C)$ code that achieves distortion $D$, we have 

$$\cdots \le I(Y_{Q+1}^n, Q, S_2^{Q-1}; Y_Q) - I(Y_{Q+1}^n, S_2^{Q-1}, Q; S_{2Q}) = I(U, V; Y) - I(U, V; S_2),$$

where in (a), we used the Csiszár sum lemma [13]; in (b), we used the fact that $S_2$ is a memoryless source; in 
(c), we defined $Q$ in the standard manner to be uniformly distributed over $[1 : n]$ and independent of every other 
random variable; and in the last step, we define $U = (U_Q, Q) = (S_2^{Q-1}, Q)$ and $V = V_Q = Y_{Q+1}^n$. With these 
definitions of auxiliary random variables, it is clear that $U$ is independent of $S_2$ and also that the encoder output $X$ 
is a function of both $U$ and $S_2$. Further, using the relationship that $U$ is independent of $S_2$ and the Markov chain $V - (U, S_2) - Y$, 
the condition that $I(U, V; Y) - I(U, V; S_2) \ge 0$ reduces to 

$$I(U; Y) \ge I(V; S_2 | U, Y).$$
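
The reduction can be checked with the chain rule for mutual information. The following is a sketch of one way to carry it out, using only the two facts stated in the text ($U$ independent of $S_2$, and the Markov chain $V - (U, S_2) - Y$); the intermediate steps are not taken from the paper.

```latex
\begin{align*}
I(U,V;S_2) &= I(U;S_2) + I(V;S_2|U) = I(V;S_2|U)
  && \text{($U$ is independent of $S_2$)} \\
I(V;S_2,Y|U) &= I(V;S_2|U) + I(V;Y|U,S_2) = I(V;S_2|U)
  && \text{(Markov chain $V - (U,S_2) - Y$)} \\
I(V;S_2,Y|U) &= I(V;Y|U) + I(V;S_2|U,Y)
  && \text{(chain rule in the other order)} \\
I(U,V;Y) - I(U,V;S_2) &= I(U;Y) + I(V;Y|U) - I(V;Y|U) - I(V;S_2|U,Y) \\
  &= I(U;Y) - I(V;S_2|U,Y).
\end{align*}
```

Requiring the last expression to be nonnegative gives the stated condition.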

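It now remains to show that the achievable distortion can be lower bounded by this choice of auxiliary random 
variables. To this end, we will use a technique for lower bounding distortion found in [14]. We have 

where the last step follows from the observation that we can recover $\hat{S}_{1i}$ from $\hat{S}'_{1i}$ by simply ignoring $S_2^{i-1}$. Next, 
consider the term $E\, d(S_{1i}, \hat{S}'_{1i}(Y_i, V_i, Y^{i-1}, S_2^{i-1}))$. 

$$\begin{aligned}
E\, d(S_{1i}, \hat{S}'_{1i}(Y_i, V_i, Y^{i-1}, S_2^{i-1})) 
&= E\, d(S_{1i}, \hat{S}'_{1i}(Y_i, V_i, Y^{i-1}, U_i)) \\
&= \sum_{s_{1i}, u_i, v_i, y_i, y^{i-1}} p(s_{1i}, u_i, v_i, y_i, y^{i-1})\, d(s_{1i}, \hat{s}'_{1i}(y_i, v_i, y^{i-1}, u_i)) \\
&= \sum_{u_i, y_i, v_i} p(u_i, y_i, v_i) \sum_{y^{i-1}, s_{1i}} p(y^{i-1}, s_{1i} | u_i, y_i, v_i)\, d(s_{1i}, \hat{s}'_{1i}(y_i, v_i, y^{i-1}, u_i)) \\
&\overset{(a)}{=} \sum_{w_i} p(w_i) \sum_{y^{i-1}, s_{1i}} p(y^{i-1} | w_i)\, p(s_{1i} | w_i)\, d(s_{1i}, \hat{s}'_{1i}(w_i, y^{i-1})) \\
&= \sum_{w_i} p(w_i) \sum_{y^{i-1}} p(y^{i-1} | w_i) \sum_{s_{1i}} p(s_{1i} | w_i)\, d(s_{1i}, \hat{s}'_{1i}(w_i, y^{i-1})) \\
&\overset{(b)}{\ge} \sum_{w_i} p(w_i) \sum_{y^{i-1}} p(y^{i-1} | w_i) \sum_{s_{1i}} p(s_{1i} | w_i)\, d(s_{1i}, \hat{s}^*_{1i}(w_i)) \\
&= \sum_{w_i, s_{1i}} p(w_i, s_{1i})\, d(s_{1i}, \hat{s}^*_{1i}(w_i)) \\
&= E\, d(S_{1i}, \hat{S}^*_{1i}(W_i)), 
\end{aligned} \tag{2}$$

where in (a), we define $w_i = (u_i, y_i, v_i)$ for notational convenience, and the fact that $p(y^{i-1}, s_{1i} | w_i) = p(y^{i-1} | w_i)\, p(s_{1i} | w_i)$ 
follows from the Markov chain $Y^{i-1} - W_i - S_{1i}$, which in turn follows from the fact that $S_2$ is only causally 
known at the encoder. Hence, given $S_2^{i-1}$, and also $X^{i-1}$ since it is a function of $S_2^{i-1}$, $Y^{i-1}$ is independent of $S_{1i}$. 
(b) follows from defining $y^{i-1*} = \arg\min_{y^{i-1}} \sum_{s_{1i}} p(s_{1i} | w_i)\, d(s_{1i}, \hat{s}'_{1i}(w_i, y^{i-1}))$ and $\hat{s}^*_{1i}(w_i) = \hat{s}'_{1i}(w_i, y^{i-1*})$. 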



Combining inequality (2) with inequality (1) then gives us 

$$\begin{aligned}
D &\ge \frac{1}{n} \sum_{i=1}^n E\, d(S_{1i}, \hat{S}^*_{1i}(Y_i, V_i, U_i)) \\
&= E_Q\left(E\left(d(S_{1Q}, \hat{S}^*_{1Q}(Y_Q, V_Q, U_Q)) \mid Q\right)\right) \\
&\ge E\, d(S_1, \hat{S}_1(Y, V, U)).
\end{aligned}$$

The bounds on the cardinality of the auxiliary random variables follow from standard arguments (see, e.g., [11, 
Appendix C]). This completes the proof of the converse. ■ 

IV. Noncausal Estimation with a helper 

Having established the distortion-cost region for the discrete memoryless estimation with a helper problem when 
the interfering signal is causally known, we now turn to the noncausal setting, that is, when $S_2$ is noncausally 
known at the encoder. This setting is more complicated and the distortion-cost region is still unknown. In this 
section, we first give an achievability scheme based on the recently proposed technique of hybrid coding [15]. We 
then specialize our setting to the case of binary estimation with a helper. 

The problem of binary estimation with a helper is one where $S_1 \sim \mathrm{Bern}(p_1)$, $S_2 \sim \mathrm{Bern}(p_2)$, $0 \le p_1, p_2 \le 1/2$, 
$X \in \{0, 1\}$, $Y = X \oplus S_1 \oplus S_2$ and $d(S_1, \hat{S}_1) = S_1 \oplus \hat{S}_1$, i.e., Hamming distortion. The cost is given by $\rho(X) = 1$ 
if $X = 1$ and $0$ otherwise. The objective of the problem is to design a coding strategy that minimizes the Hamming 
distortion in $S_1$. 
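
To make the setup concrete, the following sketch simulates the binary model for symbol-by-symbol strategies and reports the empirical cost and Hamming distortion. The function name `simulate`, the sample size, the seed, and the value $p_2 = 0.3$ are illustrative assumptions, not quantities from the paper.

```python
import random

def simulate(p1, p2, encoder, decoder, n=100_000, seed=0):
    """Empirical (cost, distortion) for Y = X xor S1 xor S2 under
    symbol-by-symbol encoder(s2, rng) -> x and decoder(y) -> s1_hat."""
    rng = random.Random(seed)
    cost = dist = 0
    for _ in range(n):
        s1 = int(rng.random() < p1)
        s2 = int(rng.random() < p2)
        x = encoder(s2, rng)
        y = x ^ s1 ^ s2
        cost += x                   # rho(X) = 1 iff X = 1
        dist += s1 ^ decoder(y)     # Hamming distortion
    return cost / n, dist / n

# Silent helper, decoder always guesses 0: distortion is about p1 at zero cost.
silent_cost, silent_dist = simulate(0.1, 0.3, lambda s2, rng: 0, lambda y: 0)
# Full cancellation, decoder outputs Y: zero distortion at cost about p2.
cancel_cost, cancel_dist = simulate(0.1, 0.3, lambda s2, rng: s2, lambda y: y)
print(silent_cost, silent_dist, cancel_cost, cancel_dist)
```

These two strategies are only the extreme points; the results in this section concern what happens between them.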

Specializing to the case of binary estimation with a helper allows us to derive a number of additional results of 
interest. In Subsection IV-A, we give a (non-trivial) condition on the cost constraint that allows us to achieve zero 
expected distortion. We then show that in the binary case, there is a penalty involved if $S_2$ is known only causally 
instead of noncausally. As a result, the distortion incurred in $S_1$ is higher if $S_2$ is only known causally as opposed 
to it being known noncausally. In Subsection IV-B, we describe the two lower bounds for the problem of binary 
estimation with a helper and then compare them. In Subsection IV-C, we briefly mention a non-binary setting for 
which we can characterize the distortion-cost tradeoff, and show that symbol-by-symbol encoding is optimal in that 
setting. 

A. Achievable scheme 

We first give an achievable scheme for the general discrete memoryless estimation with a helper problem based 
on hybrid coding [15]. We will extend this scheme to the Gaussian case in the next section. 

Theorem 2. An achievable distortion for the problem of estimation with a helper is given by 

$$D(C)_{\min} \le \inf E\, d(S_1, \hat{S}_1(U, Y)),$$

where the minimization is over distributions $p(u|s_2)$ and functions $x = f(s_2, u)$ and $\hat{s}_1(u, y)$ such that 

$$I(U; Y) > I(U; S_2),$$
$$E \rho(X) \le C.$$

Sketch of Achievability: The achievability scheme follows that of the hybrid coding scheme given in [15]. We 
give only a sketch here. The codebook generation consists of generating $2^{n(I(U;S_2)+\epsilon)}$ sequences $u^n$ according to 
$\prod_{i=1}^n p(u_i)$. For encoding, given an $s_2^n$ sequence, the encoder looks for a $u^n$ sequence such that $(u^n, s_2^n) \in \mathcal{T}_\epsilon^{(n)}$. 
If there is more than one, it selects one sequence uniformly at random from the set of jointly typical sequences. It 
then outputs $x^n$ according to $f(u_i, s_{2i})$ for $i \in [1 : n]$. The decoder looks for the unique $u^n$ sequence such that 
$(u^n, y^n) \in \mathcal{T}_\epsilon^{(n)}$. It can be shown as in [15] that the probability of decoding error goes to zero as $n \to \infty$ if 

$$I(U; Y) > I(U; S_2) + 2\epsilon.$$

The decoder then reconstructs $\hat{s}_1^n$ according to $\hat{s}_{1i}(u_i, y_i)$ for $i \in [1 : n]$. 



We now specialize the achievable distortion-cost region in Theorem 2 to the case of binary estimation with a 
helper. The next result shows that, in the binary case, zero expected distortion is achievable under a condition on 
the cost constraint. 

Proposition 1. For the problem of binary estimation with a helper, 

$$D(C)_{\min} = 0$$

if $H_2(C) > H(X \oplus S_2 | Y)$, where $H_2(\cdot)$ is the binary entropy function, $X \sim \mathrm{Bern}(C)$ independent of $S_2$ and 
$Y = X \oplus S_1 \oplus S_2$. 

Proof: The sufficient condition on the cost constraint follows from a particular choice of auxiliary random 
variable $U$ in Theorem 2. We let $X \sim \mathrm{Bern}(C)$ independent of $S_2$ and let $U = X \oplus S_2$. The decoder reconstructs 
$S_1$ as $\hat{S}_1 = Y \oplus U = S_1$, incurring zero expected distortion. We now note that the cost constraint is satisfied since 
$X \sim \mathrm{Bern}(C)$. To satisfy the mutual information condition on the choice of joint distribution, we require 

$$\begin{aligned}
& I(U; Y) > I(U; S_2) \\
\Leftrightarrow\; & H(U | S_2) > H(U | Y) \\
\Leftrightarrow\; & H(X | S_2) > H(X \oplus S_2 | Y) \\
\Leftrightarrow\; & H_2(C) > H(X \oplus S_2 | Y).
\end{aligned}$$

■ 
Weakening Proposition 1 leads to the following simple sufficient condition for zero distortion. 

Corollary 1. If $C > p_1$, $D(C)_{\min} = 0$. 

The proof of Corollary 1 follows readily from Proposition 1. Since $0 \le C, p_1 \le 1/2$, if $C > p_1$, then 

$$\begin{aligned}
H_2(C) &> H_2(p_1) \\
&= H(S_1) \\
&\ge H(S_1 | Y) \\
&= H(X \oplus S_2 | Y).
\end{aligned}$$

Remark IV.1. A trivial condition for zero distortion is $C \ge p_2$, in which case the encoder just performs 
symbol-by-symbol cancellation of $S_2$ to allow the decoder to recover $S_1$ losslessly. Corollary 1 shows that zero 
expected distortion can be achieved even if $C < p_2$ as long as $C > p_1$. 

By choosing $U = (U', V')$ in Theorem 2, where $p(u|s_2) = p(u')p(v'|u', s_2)$, we obtain the distortion-cost region 
when $S_2$ is restricted to be causally known at the encoder.¹ A natural question to ask is whether the achievable 
distortion for the same cost constraint can be lowered if $S_2$ is noncausally known at the encoder rather than only 
causally known. This is indeed the case for the problem of binary estimation with a helper. 

Proposition 2. For the problem of binary estimation with a helper, the achievable distortion when S2 is noncausally 
known at the encoder can be strictly smaller than the achievable distortion when S2 is only causally known at the 
encoder, with the same cost constraint. 

Proof: To prove Proposition 2, we exhibit an example where we can achieve zero expected distortion when 
$S_2$ is noncausally known at the encoder, but for which the achievable distortion is strictly greater than zero when 
$S_2$ is only causally known. To this end, we let $p_1 = 0.1$, $p_2 = 0.5$ and $C = 0.11$. Since $C > p_1$, from Corollary 1, 
an expected distortion of $0$ can be achieved when $S_2$ is noncausally known at the encoder. That is, we have 
$D(0.11)_{\min\text{-noncausal}} = 0$. The proof of this proposition is completed using the following claim, which states that the 
minimum expected distortion when $S_2$ is only causally known at the encoder, $D(0.11)_{\min\text{-causal}}$, is strictly greater 
than zero. 

Claim 1. $D(0.11)_{\min\text{-causal}} > 0$ for any choice of $U, V$ satisfying the constraints given in Theorem 1. 

Claim 1 is proven in Appendix II. ■ 
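
The sufficient condition of Proposition 1 is easy to evaluate numerically for the example $p_1 = 0.1$, $p_2 = 0.5$, $C = 0.11$. The sketch below is illustrative only; the function names are hypothetical, and it uses the identity $H(Z|Y) = H(Z) + H(S_1) - H(Y)$ for $Z = X \oplus S_2$, which holds because $S_1$ is independent of $Z$ and $Y = Z \oplus S_1$.

```python
import math

def h2(p):
    """Binary entropy in bits, with h2(0) = h2(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def conv(a, b):
    """Binary convolution a * b = a(1 - b) + b(1 - a)."""
    return a * (1 - b) + b * (1 - a)

def cond_entropy_z_given_y(p1, p2, c):
    """H(X xor S2 | Y) when X ~ Bern(c) is independent of S2."""
    q = conv(c, p2)                        # P(Z = 1), Z = X xor S2
    return h2(q) + h2(p1) - h2(conv(q, p1))

def zero_distortion_condition(p1, p2, c):
    """Condition H2(C) > H(X xor S2 | Y) from Proposition 1."""
    return h2(c) > cond_entropy_z_given_y(p1, p2, c)

lhs = h2(0.11)
rhs = cond_entropy_z_given_y(0.1, 0.5, 0.11)
print(lhs, rhs, zero_distortion_condition(0.1, 0.5, 0.11))
```

With $p_2 = 0.5$ the variable $X \oplus S_2$ is uniform, so the right-hand side collapses to $H_2(p_1)$ and the condition becomes $H_2(C) > H_2(p_1)$, in line with Corollary 1.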

¹ The boundary case of $I(U; Y) = I(V; S_2 | U, Y)$ is treated in a similar fashion as in the causal setting. 



B. Lower bounds for binary estimation with helper 

We now turn to lower bounds for the binary estimation with a helper problem. The first lower bound that we 
will present uses ideas from [3] adapted from the Gaussian to the binary setting. 

Theorem 3. A lower bound for the achievable distortion for the problem of binary estimation with a helper is 
given by 

$$D(C)_{\min} \ge \min H_2^{-1}(H(S_1) + H(S_2) - H(Y)) - E X,$$

where we define $H_2^{-1}(\cdot) := 0$ if the argument is negative or greater than 1, and the minimization is over 
distributions $p(x|s_2)$ such that $E X \le C$. 

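Proof: We first start with a simple claim. 

Claim 2. Let $\hat{s}_1^n(y^n)$ be an optimal reconstruction function (with respect to Hamming distortion) for $s_1^n$ and $\hat{x}^n(y^n)$ 
be an optimal reconstruction for $s_2^n \oplus x^n$. Then, $d(s_1^n, \hat{s}_1^n(y^n)) = d(s_2^n \oplus x^n, \hat{x}^n(y^n))$. 

To prove this claim, observe that $d(s_1^n, \hat{s}_1^n(y^n)) = \sum_{i=1}^n s_{1i} \oplus \hat{s}_{1i}(y^n)$. Consider now the function $\hat{x}'_i(y^n) = 
\hat{s}_{1i}(y^n) \oplus y_i$. Since $\hat{x}_i(y^n)$ is optimal for $d(s_2^n \oplus x^n, \hat{x}^n(y^n))$, we have 

$$\begin{aligned}
d(s_2^n \oplus x^n, \hat{x}^n(y^n)) &\le \sum_{i=1}^n s_{2i} \oplus x_i \oplus \hat{x}'_i(y^n) \\
&= \sum_{i=1}^n s_{2i} \oplus x_i \oplus \hat{s}_{1i}(y^n) \oplus y_i \\
&= \sum_{i=1}^n s_{1i} \oplus \hat{s}_{1i}(y^n) \\
&= d(s_1^n, \hat{s}_1^n(y^n)).
\end{aligned}$$

Hence, we have $d(s_2^n \oplus x^n, \hat{x}^n(y^n)) \le d(s_1^n, \hat{s}_1^n(y^n))$. For the other direction, consider the function $\hat{s}'_{1i} = 
\hat{x}_i(y^n) \oplus y_i$. Repeating the same arguments for $d(s_1^n, \hat{s}_1^n(y^n))$ instead of $d(s_2^n \oplus x^n, \hat{x}^n(y^n))$, it is easy to show that 
$d(s_1^n, \hat{s}_1^n(y^n)) \le d(s_2^n \oplus x^n, \hat{x}^n(y^n))$, which completes the proof of Claim 2. 

As an aside, the proof of Claim 2 shows that the optimal reconstruction functions for the respective problems 
are related by $\hat{x}^n(y^n) = \hat{s}_1^n(y^n) \oplus y^n$. 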

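We now continue with our lower bound for the binary case. Using Claim 2, we have 

$$\begin{aligned}
d(s_1^n, \hat{s}_1^n(y^n)) &= d(x^n \oplus s_2^n, \hat{x}^n(y^n)) \\
&\ge d(s_2^n, \hat{x}^n(y^n)) - d(s_2^n, x^n \oplus s_2^n).
\end{aligned} \tag{3}$$

The second line follows from the fact that the Hamming distance is a proper distance metric, and it therefore 
satisfies the triangle inequality. Hence, 

$$\begin{aligned}
\frac{1}{n} E\, d(S_1^n, \hat{s}_1^n(Y^n)) &= \frac{1}{n} E\, d(X^n \oplus S_2^n, \hat{x}^n(Y^n)) \\
&\ge \frac{1}{n} E\, d(S_2^n, \hat{x}^n(Y^n)) - \frac{1}{n} E\, d(S_2^n, X^n \oplus S_2^n).
\end{aligned}$$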

Let $Q$ be uniform over $[1 : n]$, independent of other random variables. Then, 

$$\frac{1}{n} E\, d(S_2^n, X^n \oplus S_2^n) = E\left(\frac{1}{n} \sum_{i=1}^n X_i\right) = E X_Q = E X. \tag{4}$$

This is the expected fraction of ones in $X^n$. For the term $E\, d(S_2^n, \hat{x}^n(Y^n))/n$, we lower bound it by 

$$\frac{1}{n} E\, d(S_2^n, \hat{x}^n(Y^n)) \ge \frac{1}{n} \sum_{i=1}^n E\, d(S_{2i}, \hat{s}_{2i}(Y^n)), \tag{5}$$

where $\hat{s}_2^n(Y^n)$ is an optimal reconstruction function with respect to Hamming distortion for $S_2^n$. The right-hand side 
of inequality (5) is then further lower bounded by the following argument. From the data processing inequality [16], 
we have 

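$$\begin{aligned}
I(S_2^n; \hat{S}_2^n) &\le I(S_2^n; Y^n) \\
&= H(Y^n) - H(Y^n | S_2^n) \\
&\le \sum_{i=1}^n H(Y_i) - H(S_1^n) \\
&\le n H(Y) - n H(S_1).
\end{aligned}$$

On the other hand, 

$$\begin{aligned}
I(S_2^n; \hat{S}_2^n) &\ge \sum_{i=1}^n \left(H(S_{2i}) - H(S_{2i} \oplus \hat{S}_{2i})\right) \\
&\ge n H(S_2) - n H_2\left(\frac{1}{n} \sum_{i=1}^n E\, d(S_{2i}, \hat{s}_{2i}(Y^n))\right),
\end{aligned}$$

where the last line follows from concavity of entropy [16]. Combining the upper and lower bounds gives us 

$$\frac{1}{n} \sum_{i=1}^n E\, d(S_{2i}, \hat{s}_{2i}(Y^n)) \ge H_2^{-1}(H(S_1) + H(S_2) - H(Y)), \tag{6}$$

where we define $H_2^{-1}(\cdot) := 0$ if the argument is negative or greater than 1. 
Substituting (5), (6) and (4) into (3), we have 

$$D(C)_{\min} \ge H_2^{-1}(H(S_1) + H(S_2) - H(Y)) - E X,$$

where $E X \le C$ from the cost constraint. ■ 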

Using the lower bound in Theorem 3, we can show that when $p_1 = 1/2$, symbol-by-symbol cancellation of $S_2$ 
is optimal and hence, when $p_1 = 1/2$, the minimum achievable distortion for the same cost constraint is the same 
regardless of whether $S_2$ is known causally or noncausally. 

Proposition 3. When $p_1 = 1/2$ and $p_2 \ge C$, the distortion-cost region is given by 

$$D(C)_{\min} = p_2 - C.$$

Proof: When $S_1 \sim \mathrm{Bern}(1/2)$, $Y \sim \mathrm{Bern}(1/2)$, regardless of the distribution of $S_2 \oplus X$. Hence, Theorem 3 
reduces to 

$$D(C)_{\min} \ge p_2 - E X \ge p_2 - C.$$

Achievability of this lower bound follows from Theorem 1 by setting $V = \emptyset$ and $U$ to be a random variable such that 

$$X = \begin{cases} 1 & \text{w.p. } C/p_2 \text{ if } S_2 = 1, \\ 0 & \text{w.p. } 1 - C/p_2 \text{ if } S_2 = 1, \\ 0 & \text{otherwise.} \end{cases}$$

The existence of such a $U$ follows from the functional representation lemma [11, Appendix B]. It is easy to verify 
that the expected cost constraint is satisfied with this choice of distribution $p(x|s_2)$. The reconstruction function in 
this case is simply $\hat{S}_1 = Y$. It is also easy to verify that the distortion constraint is satisfied. ■ 
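
The two "easy to verify" steps can be written out as a short exact computation. This is a sketch under the scheme described in the proof (randomized cancellation of $S_2$ and decoder output $\hat{S}_1 = Y$); the function name `cancellation_scheme` and the numbers in the usage line are illustrative assumptions.

```python
def cancellation_scheme(p2, c):
    """Exact (cost, distortion) of randomized symbol-by-symbol cancellation.

    The helper sends X = 1 with probability c / p2 when S2 = 1 and X = 0
    otherwise; the decoder outputs S1_hat = Y = S1 xor (X xor S2).
    """
    assert 0 < c <= p2
    flip = c / p2                   # P(X = 1 | S2 = 1)
    cost = p2 * flip                # E[X]
    residual = p2 * (1 - flip)      # P(X xor S2 = 1): interference left uncancelled
    distortion = residual           # an error occurs exactly when X xor S2 = 1
    return cost, distortion

cost, distortion = cancellation_scheme(0.3, 0.1)
print(cost, distortion)
```

The cost equals $C$ and the distortion equals $p_2 - C$, matching the lower bound when $p_1 = 1/2$.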

The optimization problem in Theorem 3 can be simplified in a number of cases. 

Corollary 2. Theorem 3 simplifies under the following conditions. 

1) Under the condition $p_1 + (1 - 2p_1)(p_2 - C) \ge 1/2$, Theorem 3 simplifies to 

$$D(C)_{\min} \ge H_2^{-1}(H(S_1) + H(S_2) - H_2(p_1 + (1 - 2p_1)(p_2 - C))) - C.$$

2) Under the condition $p_1 + (1 - 2p_1)(p_2 + C) \le 1/2$, Theorem 3 simplifies to 

$$D(C)_{\min} \ge H_2^{-1}(H(S_1) + H(S_2) - H_2(p_1 + (1 - 2p_1)(p_2 + C))) - C.$$

Proof: The proof follows from observing that $p_2 - C \le E(X \oplus S_2) \le p_2 + C$. Define $p_{X \oplus S_2} := E(X \oplus S_2)$. 
Then, $Y \sim \mathrm{Bern}(p_1 + (1 - 2p_1)p_{X \oplus S_2})$. If condition one in the corollary is satisfied, then $H(Y)$ is a decreasing 
function of $p_{X \oplus S_2}$. It is then easy to see from the expression in Theorem 3 that the minimizing distribution is one 
where $p_{X \oplus S_2} = p_2 - C$ and $E X = C$. A similar proof applies for the second condition, where $H(Y)$ is an increasing 
function of $p_{X \oplus S_2}$ and the minimizing distribution has $p_{X \oplus S_2} = p_2 + C$ and $E X = C$, which completes the proof of 
this corollary. ■ 
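
The closed form under the second condition can be compared against a brute-force evaluation of the Theorem 3 objective. The sketch below is illustrative: it evaluates the objective at $P(X \oplus S_2 = 1) = p_2 + C$ and $E X = C$, which is where the argument in the proof places the minimizer when $H(Y)$ is increasing, and separately searches a grid over $p(x|s_2)$. The function names, the grid resolution, and the test point ($p_1 = 0.05$, $p_2 = 0.1$, $C = 0.01$) are assumptions chosen for illustration.

```python
import math

def h2(p):
    """Binary entropy in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def h2_inv(h):
    """Inverse of h2 on [0, 1/2]; defined as 0 when h is outside [0, 1]."""
    if h < 0.0 or h > 1.0:
        return 0.0
    lo, hi = 0.0, 0.5
    for _ in range(60):             # bisection on the increasing branch
        mid = (lo + hi) / 2
        if h2(mid) < h:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def conv(a, b):
    """Binary convolution a * b = a(1 - b) + b(1 - a)."""
    return a * (1 - b) + b * (1 - a)

def objective(p1, p2, q, ex):
    """Theorem 3 objective for P(X xor S2 = 1) = q and E[X] = ex."""
    return h2_inv(h2(p1) + h2(p2) - h2(conv(p1, q))) - ex

def theorem3_grid(p1, p2, c, grid=200):
    """Grid search over a = P(X=1|S2=0), b = P(X=1|S2=1) with E[X] <= c."""
    best = float("inf")
    for i in range(grid + 1):
        a = i / grid
        for j in range(grid + 1):
            b = j / grid
            ex = (1 - p2) * a + p2 * b
            if ex > c + 1e-12:
                continue
            q = (1 - p2) * a + p2 * (1 - b)
            best = min(best, objective(p1, p2, q, ex))
    return best

def corollary2_case2(p1, p2, c):
    """Closed form when p1 * (p2 + c) <= 1/2, i.e. H(Y) increasing in q."""
    assert conv(p1, p2 + c) <= 0.5
    return objective(p1, p2, p2 + c, c)

closed = corollary2_case2(0.05, 0.1, 0.01)
brute = theorem3_grid(0.05, 0.1, 0.01)
print(closed, brute)
```

The grid search can only land on feasible points, so it should never fall below the closed form; the small gap that remains comes from the grid not containing the exact minimizer.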

It appears to be quite difficult to obtain an explicit analytical solution for the general case of $p_1 + (1 - 2p_1)(p_2 - 
C) < 1/2 < p_1 + (1 - 2p_1)(p_2 + C)$. A looser bound in this case is 

Corollary 3. 

$$D(C)_{\min} \ge H_2^{-1}(H(S_1) + H(S_2) - 1) - C.$$

The proof of this corollary is omitted as it follows directly from Theorem 3. 

We now present another lower bound for the binary setting, using ideas from the proof of the converse for Gel'fand-Pinsker 
coding given in [11, Chapter 7], and also ideas from [5]. The main intuition in this lower bound comes 
from Claim 2 used in the proof of Theorem 3, which shows that the optimum distortion incurred in reconstructing 
$X \oplus S_2$ is the same as the optimum distortion incurred in reconstructing $S_1$. We then try to lower bound $D(C)_{\min}$ 
by lower bounding the distortion incurred in reconstructing $X \oplus S_2$. We will see in the sequel that in some cases, 
this lower bound is better than the previous lower bound given in Theorem 3. 

Theorem 4. A lower bound for the achievable distortion for the problem of binary estimation with a helper is 
given by 

$$D(C)_{\min} \ge \min H_2^{-1}(H(S_1) + H(X \oplus S_2 | U) + I(U; S_2) - H(Y)),$$

where we minimize over $p(u|s_2)$ and $x = f(u, s_2)$ such that $E X \le C$. The cardinality of the auxiliary random 
variable $U$ may be upper bounded by $|\mathcal{U}| \le |\mathcal{S}_2|(|\mathcal{X}| - 1) + 2$. In the binary case that we are interested in, $|\mathcal{U}| \le 4$. 

Proof: For notational convenience, let $\hat{Z}$ represent the optimal reconstruction for $X \oplus S_2$ and $Z = X \oplus S_2$. 
From the data processing inequality, 

$$I(Z^n; \hat{Z}^n) \le I(Y^n; Z^n).$$

On the one hand, 

$$\begin{aligned}
I(Z^n; \hat{Z}^n) &= H(Z^n) - H(Z^n | \hat{Z}^n) \\
&\ge \sum_{i=1}^n \left(H(Z_i | Z^{i-1}) - H(Z_i \oplus \hat{Z}_i)\right) \\
&\overset{(a)}{\ge} \sum_{i=1}^n H(Z_i | Z^{i-1}) - n H_2(D(C)_{\min}) \\
&= \sum_{i=1}^n \left(H(Z_i | Z^{i-1}, S_{2,i+1}^n) + I(Z^{i-1}; S_{2i} | S_{2,i+1}^n)\right) - n H_2(D(C)_{\min}) \\
&= \sum_{i=1}^n \left(H(Z_i | Z^{i-1}, S_{2,i+1}^n) + I(Z^{i-1}, S_{2,i+1}^n; S_{2i})\right) - n H_2(D(C)_{\min}) \\
&= \sum_{i=1}^n \left(H(Z_i | U_i) + I(U_i; S_{2i})\right) - n H_2(D(C)_{\min}).
\end{aligned}$$

In (a), we used concavity of entropy and Claim 2, which states that the optimum distortion for $X \oplus S_2$ is the same 
as the optimum distortion for $S_1$. Next, 

$$\begin{aligned}
I(Y^n; Z^n) &= H(Y^n) - H(Y^n | Z^n) \\
&\le \sum_{i=1}^n H(Y_i) - \sum_{i=1}^n H(S_{1i}).
\end{aligned}$$

Defining the standard $Q$ uniform random variable over $[1 : n]$ independent of other random variables, $U = (U_Q, Q)$, 
$Y_Q = Y$, $S_{1Q} = S_1$, $S_{2Q} = S_2$ and $Z_Q = Z$ then gives us the following lower bound 

$$D \ge H_2^{-1}(H(S_1) + H(Z | U) + I(U; S_2) - H(Y)),$$

where we minimize over $p(u|s_2)p(x|u, s_2)$ such that $E X \le C$. Reducing the cost constraint to this single-letter 
expression ($E X \le C$) follows the same procedure as in Theorem 3. 

Next, we note that instead of minimizing over $p(x|u, s_2)$, it suffices to minimize over $x = f(u, s_2)$. To see 
this, note that we can always find a $V$, independent of $U, S_2$, such that $p(x|u, s_2)$ is represented by $x = f(u, v, s_2)$. Now, define 
$U' = (U, V)$. Observe that since we preserve both $p(x \oplus s_2)$ and $p(x)$, the cost constraint and $H(Y) = H(Z \oplus S_1)$ 
remain unchanged. Now, note that 

$$H(Z | U') \le H(Z | U),$$

and 

$$\begin{aligned}
I(U'; S_2) &= I(U; S_2) + I(V; S_2 | U) \\
&= I(U; S_2).
\end{aligned}$$

The bound on the cardinality of $U$ follows from standard techniques and we omit it here. This completes the proof 
of the lower bound. ■ 

Theorem 4 involves minimizing over joint distributions and the choice of auxiliary random variable $U$. A looser 
bound that is easier to compute is given by the following corollary. 

Corollary 4. 

$$D(C)_{\min} \ge H_2^{-1}\left(H_2(p_1) + \min_{p_2 - C \le \alpha \le p_2 + C} \{H_2(\alpha) - H_2(\alpha * p_1)\} - I(U; Z) + I(U; S_2)\right)$$

for some joint distribution $p(u|s_2)$ and $x = f(u, s_2)$ satisfying $E X \le C$, and $Z = X \oplus S_2$. 

In Corollary 4, we need to perform maximization of $I(U; Z) - I(U; S_2)$ subject to a cost constraint $E X \le C$. 
This is nothing but the problem of maximizing the capacity of a Gel'fand-Pinsker channel subject to a cost 
constraint. There are efficient numerical algorithms for performing this maximization; see [11, pp. 555-556] for 
a description of the algorithm. 
Proof: 

Starting from Theorem 4, consider the term $H(Z|U) + I(U; S_2) - H(Y)$ in the Theorem. 

$$\begin{aligned}
H(Z|U) + I(U; S_2) - H(Y) &= H(Z, U) - H(U|S_2) - H(Y) \\
&= H(Z) - H(Y) + H(U|Z) - H(U|S_2) \\
&= (H(Z) - H(Y)) - (I(U; Z) - I(U; S_2)).
\end{aligned}$$

We now minimize the terms $(H(Z) - H(Y))$ and $-(I(U; Z) - I(U; S_2))$ separately. We have discussed 
maximizing the term $I(U; Z) - I(U; S_2)$ earlier. As for the term $(H(Z) - H(Y))$, using the observation $p_2 - C \le 
E Z \le p_2 + C$, we have 

$$\min \{H(Z) - H(Y)\} = \min_{p_2 - C \le \alpha \le p_2 + C} \{H_2(\alpha) - H_2(\alpha * p_1)\},$$

which completes the proof. ■ 
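
The one-dimensional minimization over $\alpha$ in Corollary 4 is straightforward to evaluate numerically. The following sketch uses a plain grid search; the function name `entropy_gap_min`, the grid size, and the sample parameters are illustrative assumptions, and the Gel'fand-Pinsker term $I(U;Z) - I(U;S_2)$ is not computed here.

```python
import math

def h2(p):
    """Binary entropy in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def conv(a, b):
    """Binary convolution a * b = a(1 - b) + b(1 - a)."""
    return a * (1 - b) + b * (1 - a)

def entropy_gap_min(p1, p2, c, steps=1000):
    """Grid minimum of H2(alpha) - H2(alpha * p1) over [p2 - c, p2 + c]."""
    lo, hi = max(p2 - c, 0.0), min(p2 + c, 1.0)
    best = float("inf")
    for k in range(steps + 1):
        alpha = lo + (hi - lo) * k / steps
        best = min(best, h2(alpha) - h2(conv(alpha, p1)))
    return best

gap = entropy_gap_min(0.05, 0.1, 0.02)
print(gap)
```

Since $H(Z) \le H(Y) \le H(Z) + H(S_1)$, the returned value always lies in $[-H_2(p_1), 0]$, which gives a quick sanity check on the output.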

Comparison of lower bounds 

As we mentioned, the expressions in Theorems 3 and 4 can be difficult to compute. For the purpose of simulations, 
we compare the expressions of Corollary 2 with those of Corollary 4 when the conditions given in Corollary 2 are 
satisfied. Note that since Corollary 4 can be weaker than Theorem 4 whereas Corollary 2 gives the same bounds 
as Theorem 3 when the conditions are satisfied, an advantage of this comparison is that it shows when Theorem 4 
can be strictly larger than Theorem 3. 

For our numerical example, we set $p_2 = 0.1$, vary the cost from 0.01 to 0.03 and compute plots for $p_1 = 0.05, 0.09$. 
In general, the bound in Theorem 3 is better, but we focus on small values of cost, $p_1$ and $p_2$ to show that there 
are regimes in which the expression in Theorem 4 is better. The plots are shown in Figures 3 and 4. As can be 
seen in Figure 3, there are regions for which Theorem 4 is strictly better than Theorem 3. However, Theorem 3 
does give a better bound for a wider range of values as compared to Theorem 4. 



Fig. 3: Comparison of bounds for $p_1 = 0.05$ (curves: Corollary 2 and Corollary 4). Y-axis represents the distortion level while X-axis represents the cost, which ranges from 0.01 to 0.03.



C. Erasure estimation with helper 

For most of this section, we have focused on the binary estimation with helper setup. In this subsection, we
briefly mention a setting, erasure estimation with helper, for which we can characterize the distortion-cost function
and for which symbol-by-symbol cancellation of $S_2$ is optimal.

Fig. 4: Comparison of bounds for $p_1 = 0.1$. Y-axis represents the distortion level while X-axis represents the cost.
In this case, the bound given by Corollary 2 is strictly better than that given by Corollary 4.

The setting is defined by $S_1 \sim p(s_1)$, $S_2 \sim \mathrm{Bern}(p_2)$, $X \in \{0, 1\}$, and $Y$ is defined as follows:

\[
Y = \begin{cases} S_1 & \text{if } X \oplus S_2 = 0, \\ e & \text{if } X \oplus S_2 = 1. \end{cases}
\]

This is a model of a channel in which the desired signal is erased when the interfering signal is large. When the
interfering signal is small, the decoder receives the signal perfectly. The helper tries to help the decoder by canceling
the interference. The distortion-cost region is characterized by the following proposition.

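Proposition 4. The distortion-cost region for the problem of erasure estimation with helper is given by

\[
\min \left[ P(X \oplus S_2 = 1) \Big( \min_{\hat{s}_1} \mathrm{E}\, d(S_1, \hat{s}_1) \Big)
+ P(X \oplus S_2 = 0)\, \mathrm{E}\Big( \min_{\hat{s}_1} d(S_1, \hat{s}_1) \Big) \right],
\]

where the minimization is over $p(x|s_2)$ satisfying $\mathrm{E}\,\rho(X) \le C$.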
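As an illustration, the expression in Proposition 4 can be evaluated by brute force in a simple special case. The sketch below makes several assumptions that are not in the text: $S_1 \sim \mathrm{Bern}(p_1)$ with Hamming distortion (so that $\min_{\hat{s}_1} \mathrm{E}\, d(S_1, \hat{s}_1) = \min(p_1, 1-p_1)$ and $\mathrm{E} \min_{\hat{s}_1} d(S_1, \hat{s}_1) = 0$) and cost function $\rho(x) = x$. The function name and grid size are hypothetical.

```python
def erasure_distortion(p1, p2, cost, grid=201):
    """Grid search over p(x|s2) for the erasure distortion-cost expression.

    Assumes S1 ~ Bern(p1), Hamming distortion, and cost rho(x) = x.
    q0 = P(X = 1 | S2 = 0), q1 = P(X = 1 | S2 = 1).
    """
    d_erased = min(p1, 1.0 - p1)  # min over s_hat of E d(S1, s_hat)
    d_clean = 0.0                 # E of min over s_hat of d(S1, s_hat)
    best = float("inf")
    for i in range(grid):
        q0 = i / (grid - 1)
        for j in range(grid):
            q1 = j / (grid - 1)
            expected_cost = (1.0 - p2) * q0 + p2 * q1
            if expected_cost > cost + 1e-12:
                continue  # violates E X <= C
            p_erase = (1.0 - p2) * q0 + p2 * (1.0 - q1)  # P(X xor S2 = 1)
            best = min(best, p_erase * d_erased + (1.0 - p_erase) * d_clean)
    return best
```

Under these assumptions the search agrees with the closed form $\max(p_2 - C, 0)\,\min(p_1, 1 - p_1)$, which corresponds to the helper spending its whole budget on canceling $S_2$ symbol by symbol.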

Proof: Achievability of the distortion-cost region uses a modified version of the achievability scheme used in
Proposition 3. The modification comes in the reconstruction function, where

\[
\hat{s}_1(Y) = \begin{cases} \arg\min_x d(Y, x) & \text{if } Y = S_1, \\ \arg\min_x \mathrm{E}\, d(S_1, x) & \text{if } Y = e. \end{cases}
\]

With this choice of reconstruction function, and noting that $P(Y = S_1) = P(X \oplus S_2 = 0)$ and
$P(Y = e) = P(X \oplus S_2 = 1)$, it is easy to see that the achievable distortion-cost region simplifies to the
expression given in the Proposition.

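For the converse, fixing an $(n, C)$ code achieving distortion $D$, we have

\[
D = \frac{1}{n} \sum_{i=1}^{n} \mathrm{E}\, d(S_{1i}, \hat{s}_{1i}(Y^n)).
\]

Consider now the term $\mathrm{E}\, d(S_{1i}, \hat{s}_{1i}(Y^n))$. Writing $y^{n \setminus i}$ for $y^n$ with the $i$-th entry removed, we have

\[
\begin{aligned}
\mathrm{E}\, d(S_{1i}, \hat{s}_{1i}(Y^n)) &= \sum p(s_{1i}, y^{n \setminus i}, y_i)\, d(s_{1i}, \hat{s}_{1i}(y^n)) \\
&= \sum \Big( p(s_{1i}, y^{n \setminus i}, y_i, x_i \oplus s_{2i} = 0)\, d(s_{1i}, \hat{s}_{1i}(y^n)) \\
&\qquad + p(s_{1i}, y^{n \setminus i}, y_i, x_i \oplus s_{2i} = 1)\, d(s_{1i}, \hat{s}_{1i}(y^n)) \Big) \\
&\overset{(a)}{=} \sum \Big( p(s_{1i}, y^{n \setminus i}, y_i, x_i \oplus s_{2i} = 0)\, d(s_{1i}, \hat{s}_{1i}(y^n)) \\
&\qquad + p(s_{1i}, y^{n \setminus i}, x_i \oplus s_{2i} = 1)\, d(s_{1i}, \hat{s}_{1i}(y^{n \setminus i}, y_i = e)) \Big) \\
&= \sum p(s_{1i}, x_i \oplus s_{2i} = 0)\, p(y^{n \setminus i}, y_i \,|\, s_{1i}, x_i \oplus s_{2i} = 0)\, d(s_{1i}, \hat{s}_{1i}(y^n)) \\
&\qquad + \sum p(s_{1i}, y^{n \setminus i}, x_i \oplus s_{2i} = 1)\, d(s_{1i}, \hat{s}_{1i}(y^{n \setminus i}, y_i = e)),
\end{aligned}
\]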

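where (a) follows from the fact that when $x_i \oplus s_{2i} = 1$, $y_i = e$. Next, focusing on the first term in the sum, we note
that $P(Y_i = S_{1i} \,|\, X_i \oplus S_{2i} = 0, S_{1i}) = 1$. Hence, using $1\{\cdot\}$ to denote the indicator function, the first term simplifies
to the following:

\[
\begin{aligned}
&\sum p(s_{1i}, x_i \oplus s_{2i} = 0)\, p(y^{n \setminus i}, y_i \,|\, s_{1i}, x_i \oplus s_{2i} = 0)\, d(s_{1i}, \hat{s}_{1i}(y^n)) \\
&\quad = \sum p(s_{1i}, x_i \oplus s_{2i} = 0)\, p(y^{n \setminus i} \,|\, s_{1i}, x_i \oplus s_{2i} = 0, y_i)\, 1\{y_i = s_{1i}\}\, d(s_{1i}, \hat{s}_{1i}(y^n)) \\
&\quad \ge \sum p(s_{1i}, x_i \oplus s_{2i} = 0)\, p(y^{n \setminus i} \,|\, s_{1i}, x_i \oplus s_{2i} = 0, y_i) \Big( \min_{x \in \hat{\mathcal{S}}_1} d(s_{1i}, x) \Big) \\
&\quad = \sum p(s_{1i})\, p(x_i \oplus s_{2i} = 0) \Big( \min_{x \in \hat{\mathcal{S}}_1} d(s_{1i}, x) \Big) \\
&\quad = P(X_i \oplus S_{2i} = 0)\, \mathrm{E}\Big( \min_{x \in \hat{\mathcal{S}}_1} d(S_{1i}, x) \Big).
\end{aligned}
\]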

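Hence, $\mathrm{E}\, d(S_{1i}, \hat{s}_{1i}(Y^n))$ is lower bounded by

\[
\begin{aligned}
\mathrm{E}\, d(S_{1i}, \hat{s}_{1i}(Y^n)) &\ge P(X_i \oplus S_{2i} = 0)\, \mathrm{E}\Big( \min_{x \in \hat{\mathcal{S}}_1} d(S_{1i}, x) \Big) \\
&\qquad + \sum p(s_{1i}, y^{n \setminus i}, x_i \oplus s_{2i} = 1)\, d(s_{1i}, \hat{s}_{1i}(y^{n \setminus i}, y_i = e)) \\
&= P(X_i \oplus S_{2i} = 0)\, \mathrm{E}\Big( \min_{x \in \hat{\mathcal{S}}_1} d(S_{1i}, x) \Big) \\
&\qquad + \sum p(s_{1i})\, p(y^{n \setminus i}, x_i \oplus s_{2i} = 1)\, d(s_{1i}, \hat{s}_{1i}(y^{n \setminus i}, y_i = e)) \\
&\ge P(X_i \oplus S_{2i} = 0)\, \mathrm{E}\Big( \min_{x \in \hat{\mathcal{S}}_1} d(S_{1i}, x) \Big) \\
&\qquad + \sum p(y^{n \setminus i}, x_i \oplus s_{2i} = 1) \Big( \min_{x \in \hat{\mathcal{S}}_1} \sum_{s_{1i}} p(s_{1i})\, d(s_{1i}, x) \Big) \\
&= P(X_i \oplus S_{2i} = 0)\, \mathrm{E}\Big( \min_{x \in \hat{\mathcal{S}}_1} d(S_{1i}, x) \Big)
+ P(X_i \oplus S_{2i} = 1) \Big( \min_{x \in \hat{\mathcal{S}}_1} \mathrm{E}\, d(S_{1i}, x) \Big).
\end{aligned}
\]

We note now that if $y_i = 0$ or $1$, then we can achieve the minimum possible distortion $\min_{\hat{s}_1} d(s_1, \hat{s}_1)$ using only
knowledge of $y_i$, since $s_{1i}$ is known in this case.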
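We therefore obtain

\[
D = \frac{1}{n} \sum_{i=1}^{n} \mathrm{E}\, d(S_{1i}, \hat{s}_{1i}(Y^n))
\ge \frac{1}{n} \sum_{i=1}^{n} \left( P(X_i \oplus S_{2i} = 0)\, \mathrm{E}\Big( \min_{x \in \hat{\mathcal{S}}_1} d(S_{1i}, x) \Big)
+ P(X_i \oplus S_{2i} = 1) \Big( \min_{x \in \hat{\mathcal{S}}_1} \mathrm{E}\, d(S_{1i}, x) \Big) \right).
\]

Defining $Q \sim \mathrm{Unif}[1 : n]$ independent of other random variables then gives us

\[
D \ge P(X \oplus S_2 = 1) \Big( \min_{\hat{s}_1} \mathrm{E}\, d(S_1, \hat{s}_1) \Big)
+ P(X \oplus S_2 = 0)\, \mathrm{E}\Big( \min_{\hat{s}_1} d(S_1, \hat{s}_1) \Big).
\]

For the cost constraint, we have

\[
\mathrm{E}\Big( \frac{1}{n} \sum_{i=1}^{n} X_i \Big) = \mathrm{E} X_Q = \mathrm{E} X,
\]

which completes the proof. ■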

V. Gaussian Estimation with helper 

In this section, we extend our setup to the Gaussian case, where $S_1 \sim N(0, 1)$, $S_2 \sim N(0, P_2)$, $Y = X + S_1 + S_2$,
$d(S_1, \hat{S}_1) = (S_1 - \hat{S}_1)^2$ and the cost constraint is $\mathrm{E}X^2 \le P$. As we mentioned in the Introduction, the problem in
the Gaussian case is equivalent to the problem of Assisted Interference Suppression considered in [3]. We present a
new lower bound for this problem that can improve on those derived in [3] and [6]. The lower bound derived in [6]
includes the lower bound derived in [3] as a special case and can be strictly better, but for clarity of presentation,
we first compare our lower bound to that in [3] in subsection V-C and then compare our bound with the lower
bound derived in [6] in subsection V-D. We begin with an achievability argument based on Theorem 2.

A. Achievable distortion-cost region 

We specialize Theorem 2 to the Gaussian case by choosing the auxiliary random variables as Gaussian random
variables. The achievability scheme presented here is essentially the same as the scheme presented in fS), but we 
derive it via different means. 

Theorem 5. An achievable distortion for the problem of Gaussian estimation with a helper is given by 

E[/2 



where 



'P' 



DiPUn < inf 1 E y2 E (72 - (E UYy ' 
-2a/3\/P+ ^4:a^P^P + 4:{l-a^)P 



E t/2 = P' + 27/3 /P^ + 72^2 



E{UY) = apVPP +P' + a7/P^ + 7/3/P^ + /3/P^ + 7P2, 
E r^ = P + 1 + P2 + 2a^J~PF2 + 2/3/P^. 
and the infimum is taken over $-1 \le \alpha \le 1$, $-1 \le \beta \le 1$ and $\gamma \in \mathbb{R}$ satisfying the constraint

We defer the proof of Theorem 5 to Appendix III.

Similar to the binary setup, we can derive a nontrivial condition relating $P$ and the power of the source $S_1$
(normalized to 1) under which zero expected distortion can be achieved.

Proposition 5. For the problem of Gaussian estimation with a helper, $D(P)_{\min} = 0$ if

\[
P \ge 1 - \frac{1}{P + P_2 + 1}.
\]

Proof: The proof of this Proposition follows from a choice of $\alpha$ and $\beta$ in Theorem 5. However, we give a slightly
different proof that gives more intuition to this condition and also has parallels with the problem of dirty paper 
coding Oil (see also Hi] Chapter 7]). 

Starting from Theorem 2, we let $U = X + S_2$, where $X \sim N(0, P)$ is independent of $S_2$. Note that the cost
constraint is satisfied by this choice of $U$. If the decoder can decode $U$, then the distortion incurred is zero, since
$S_1 = Y - U$. It therefore remains to satisfy the decoding condition, which is

\[
I(U;Y) \ge I(U;S_2).
\]

Since all the random variables are Gaussian, this decoding condition reduces to $h(U|S_2) \ge h(U|Y)$. We have

\[
h(U|S_2) = h(X|S_2) = \frac{1}{2} \log 2\pi e P.
\]

On the other hand,

\[
\begin{aligned}
h(U|Y) &= h(-S_1|Y) \\
&\overset{(a)}{=} h(-S_1 - \mathrm{E}(-S_1|Y)) \\
&= \frac{1}{2} \log 2\pi e \left( 1 - \frac{1}{P + P_2 + 1} \right),
\end{aligned}
\]

where (a) follows from the fact that, for Gaussian random variables, the difference between $S_1$ and its minimum
mean square error estimate is independent of the observation $Y$.
We therefore derive the condition

\[
P \ge 1 - \frac{1}{P + P_2 + 1}.
\]

■
Note that, similar to the binary case, the expected distortion can be made to be zero even if P2 is much larger 
than P. 
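The zero-distortion condition can be checked numerically from second moments alone. The sketch below assumes the choice $U = X + S_2$ with $X \sim N(0, P)$ independent of $S_2$, as in the proof of Proposition 5, and compares the conditional variances of $U$ given $S_2$ and given $Y$. The closed-form threshold follows from rearranging $P \ge 1 - 1/(P + P_2 + 1)$ into $P^2 + P P_2 - P_2 \ge 0$; the function names are illustrative.

```python
import math


def conditional_variance(var_u, var_y, cov_uy):
    """Var(U | Y) for jointly Gaussian, zero-mean (U, Y)."""
    return var_u - cov_uy ** 2 / var_y


def zero_distortion_ok(P, P2):
    """Check h(U|S2) >= h(U|Y) for U = X + S2 and Y = U + S1, with Var(S1) = 1."""
    var_u = P + P2
    var_y = P + P2 + 1.0
    # Cov(U, Y) = Var(U) because S1 is independent of U.
    var_u_given_y = conditional_variance(var_u, var_y, var_u)
    var_u_given_s2 = P  # Var(X + S2 | S2) = Var(X)
    return var_u_given_s2 >= var_u_given_y


def power_threshold(P2):
    """Smallest P with P >= 1 - 1/(P + P2 + 1), i.e. P^2 + P*P2 - P2 >= 0."""
    return (-P2 + math.sqrt(P2 * P2 + 4.0 * P2)) / 2.0
```

For $P_2 = 1$ the threshold is $(\sqrt{5} - 1)/2 \approx 0.618$, and it approaches 1 as $P_2$ grows, so the required helper power stays bounded no matter how strong the interference is.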

B. Lower bounds 

We now turn to lower bounds for the problem of Gaussian estimation with helper. We first state the following
lower bound given in [3] and its improved version given in [6].

Theorem 6. [3] A lower bound for the problem of Gaussian estimation with helper is given by

\[
D(P)_{\min} \ge \left( \left[ \sqrt{\frac{P_2}{P_2 + 2\sqrt{P P_2} + P + 1}} - \sqrt{P} \right]^+ \right)^2,
\]

where $[\cdot]^+$ denotes the positive part.

As shown in [6], the lower bound given in Theorem 6 can be improved to the following.

Theorem 7. [6] A lower bound for the problem of Gaussian estimation with helper is given by

\[
D(P)_{\min} \ge \inf_{\sigma_{XS_2}} \sup_{\gamma > 0} \frac{1}{\gamma^2}
\left( \left[ \sqrt{\frac{P_2}{1 + P_2 + P + 2\sigma_{XS_2}}}
- \sqrt{(1 - \gamma)^2 P_2 + \gamma^2 P - 2\gamma(1 - \gamma)\sigma_{XS_2}} \right]^+ \right)^2,
\]

where $[\cdot]^+$ denotes the positive part and $\sigma_{XS_2} \in [-\sqrt{P_2 P}, \sqrt{P_2 P}]$.

From the lower bound in Theorem 6 and Proposition 5, we can show that as the power of the interfering signal
goes to infinity, $P_2 \to \infty$, zero expected distortion is achievable if and only if $P \ge 1$.

Proposition 6. $\lim_{P_2 \to \infty} D(P)_{\min} = 0$ if and only if $P \ge 1$.

Proof: From Proposition 5, the sufficient condition for zero expected distortion reduces to $P \ge 1$ as $P_2 \to \infty$.
From Theorem 6, we can show that this is also necessary. From Theorem 6,

\[
\lim_{P_2 \to \infty} D(P)_{\min} \ge \left( \left[ 1 - \sqrt{P} \right]^+ \right)^2,
\]

which is zero if and only if $P \ge 1$.
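The limiting behaviour used in this proof can be seen numerically. The sketch below assumes that the bound of Theorem 6 has the form $\big(\big[\sqrt{P_2 / (P_2 + 2\sqrt{P P_2} + P + 1)} - \sqrt{P}\big]^+\big)^2$; the function name is illustrative.

```python
import math


def theorem6_bound(P, P2):
    """Lower bound ([sqrt(kappa) - sqrt(P)]^+)^2 with
    kappa = P2 / (P2 + 2*sqrt(P*P2) + P + 1) (assumed form of Theorem 6)."""
    kappa = P2 / (P2 + 2.0 * math.sqrt(P * P2) + P + 1.0)
    gap = max(math.sqrt(kappa) - math.sqrt(P), 0.0)
    return gap * gap
```

As $P_2$ grows, `theorem6_bound(P, P2)` approaches $([1 - \sqrt{P}]^+)^2$; for instance, with $P = 0.25$ the value tends to $0.25$, while for any $P \ge 1$ the bound is identically zero.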

We now turn to our lower bound. For clarity, we first present a proof of a special case of our lower bound before 
turning to the more general expression. 

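Proposition 7. A lower bound for the problem of Gaussian estimation with a helper is given by

\[
D(P)_{\min} \ge \frac{1}{\gamma - 1} \left( \ln \frac{1 + \gamma P_2}{1 + P_2}
+ \frac{2\sqrt{P}}{\sqrt{P_2}\,(1 + \gamma P_2)} - \frac{2\sqrt{P}}{\sqrt{P_2}\,(1 + P_2)} - \gamma P \right)
\]

for any $\gamma > 1$.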



It should be noted that while finding the optimal value of $\gamma$ that maximizes this lower bound is a hard optimization
problem, any $\gamma > 1$ constitutes a lower bound for $D(P)_{\min}$. Hence, Proposition 7 in fact gives a family of lower
bounds.
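Because every $\gamma > 1$ gives a valid bound, a coarse search over $\gamma$ is a simple way to use this family in practice. The following is a minimal sketch, assuming the bound has the form $\frac{1}{\gamma - 1}\big(\ln\frac{1 + \gamma P_2}{1 + P_2} + \frac{2\sqrt{P}}{\sqrt{P_2}(1 + \gamma P_2)} - \frac{2\sqrt{P}}{\sqrt{P_2}(1 + P_2)} - \gamma P\big)$; the function names and the candidate list of $\gamma$ values are illustrative choices.

```python
import math


def prop7_bound(P, P2, gamma):
    """Evaluate the Proposition 7 lower bound (assumed form) for one gamma > 1."""
    if gamma <= 1.0:
        raise ValueError("gamma must be strictly greater than 1")
    log_term = math.log((1.0 + gamma * P2) / (1.0 + P2))
    root = math.sqrt(P / P2)  # equals sqrt(P) / sqrt(P2)
    correction = 2.0 * root / (1.0 + gamma * P2) - 2.0 * root / (1.0 + P2)
    return (log_term + correction - gamma * P) / (gamma - 1.0)


def best_prop7_bound(P, P2, gammas):
    """Best bound over a finite list of candidate gammas, floored at zero."""
    return max(0.0, max(prop7_bound(P, P2, g) for g in gammas))
```

For example, `best_prop7_bound(0.01, 10.0, [1.5, 2.0, 3.0, 5.0])` returns the largest of the four candidate bounds; a finer grid can only improve the result.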

Proof: This proof hinges on an application of a relationship between mismatched estimation and relative
entropy given in [7, Equality (14)]. The main idea behind the proof lies in considering a decoder that performs
the estimation (and reconstruction) using a wrong (or mismatched) distribution for $P_{S_1^n|Y^n}$. In particular, we
consider a mismatched decoder that attempts to estimate $S_1^n$ assuming that $X^n = 0$. That is, the decoder assumes
that the encoder does not do anything to help the decoder. The estimation error incurred by the mismatched decoder,
$\mathrm{MSE}_Q$, is clearly no smaller than that incurred by an optimum decoder that uses the correct (true) distribution, $D(P)_{\min}$.
We then rely on results in [7] to bound the difference between $D(P)_{\min}$ and $\mathrm{MSE}_Q$, thereby giving us a
lower bound on $D(P)_{\min}$.

To derive our bound, we first consider a more general source $S_1 \sim N(0, 1/\gamma)$ and let $S_2 \sim N(0, P_2)$ as before.
The value of $\gamma$ that we are concerned about is $\gamma = 1$, which will appear later in the proof.

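Define $\mathrm{MSE}_Q(\gamma)$ as

\[
\mathrm{MSE}_Q(\gamma) := \mathrm{E} \left\| S_1^n - \frac{1/\gamma}{1/\gamma + P_2} (X^n + S_1^n + S_2^n) \right\|^2.
\]

Let $\alpha = \frac{1/\gamma}{1/\gamma + P_2}$ and note that $\hat{S}_1 = \alpha Y$ is the minimum mean square error (MMSE) estimate of $S_1$ that the
decoder would employ if it assumes that $X^n = 0$. We first give a lower bound for $\mathrm{MSE}_Q(\gamma)$. Note that under
the true distribution, $\mathrm{E}\|X^n\|^2 \le nP$.

\[
\begin{aligned}
\mathrm{E}\|S_1^n - \alpha(X^n + S_1^n + S_2^n)\|^2
&= \mathrm{E}\|S_1^n - \alpha(S_1^n + S_2^n)\|^2 - 2\alpha\, \mathrm{E}\langle S_1^n - \alpha(S_1^n + S_2^n), X^n \rangle + \alpha^2\, \mathrm{E}\|X^n\|^2 \\
&= n\alpha P_2 + 2\alpha^2\, \mathrm{E}\langle S_2^n, X^n \rangle + \alpha^2\, \mathrm{E}\|X^n\|^2 \\
&\overset{(a)}{\ge} n\alpha P_2 - 2\alpha^2 \sqrt{\mathrm{E}\|S_2^n\|^2\, \mathrm{E}\|X^n\|^2} + \alpha^2\, \mathrm{E}\|X^n\|^2 \\
&\ge n\alpha P_2 - 2\alpha^2 \sqrt{n P_2 \cdot n P} \\
&= n\alpha P_2 - 2n\alpha^2 \sqrt{P P_2},
\end{aligned}
\]

where (a) follows from the Cauchy-Schwarz inequality.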

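Now, let $\tilde{S}^n = S_2^n + X^n$ and let $P_{\tilde{S}^n}$ denote the distribution of $S_2^n + X^n$ under the optimum encoding scheme.
Let $Q_{\tilde{S}^n}$ denote the corresponding distribution under the encoding scheme $X^n = 0$. Note now that

\[
\begin{aligned}
\mathrm{MSE}_Q(\gamma) &= \mathrm{E}\|S_1^n - \alpha(S_1^n + \tilde{S}^n)\|^2 \\
&\overset{(a),(b)}{=} \mathrm{E}\|Y^n - \tilde{S}^n - (Y^n - \mathrm{E}_Q(\tilde{S}^n|Y^n))\|^2 \\
&= \mathrm{E}\|\tilde{S}^n - \mathrm{E}_Q(\tilde{S}^n|Y^n)\|^2 \\
&:= \mathrm{MSE}_{Q_{\tilde{S}^n}}(\gamma). \qquad (7)
\end{aligned}
\]

Here (a) follows from the fact that $\alpha(S_1^n + \tilde{S}^n)$ is the optimum MMSE estimator for $S_1^n$ under $Q$; that is, under the
assumption of $X^n = 0$. (b) follows from $S_1^n = Y^n - \tilde{S}^n$.

Next, note that this analysis also holds when the decoder knows that $\tilde{S}^n$ is distributed according to $P_{\tilde{S}^n}$. That
is, we have

\[
\begin{aligned}
\mathrm{MMSE}(\gamma) &:= \mathrm{E}\|S_1^n - \mathrm{E}_P(S_1^n|Y^n)\|^2 \\
&= \mathrm{E}\|\tilde{S}^n - \mathrm{E}_P(\tilde{S}^n|Y^n)\|^2 \\
&:= \mathrm{MMSE}_{P_{\tilde{S}^n}}(\gamma). \qquad (8)
\end{aligned}
\]

Note that $nD(P)_{\min} = \mathrm{MMSE}(1)$.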

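We now relate $\mathrm{MSE}_Q(\gamma)$ to the optimum MMSE of $S_1^n$ given that an optimum estimator and coding scheme
were used. From (7) and (8), we see that it suffices to consider $\mathrm{MSE}_{Q_{\tilde{S}^n}}(\gamma)$ and $\mathrm{MMSE}_{P_{\tilde{S}^n}}(\gamma)$. Using the
relation between mismatched estimation and relative entropy given in [7, Equality (14)], we have

\[
D\big(P^{(\gamma)}_{Y^n} \,\big\|\, Q^{(\gamma)}_{Y^n}\big) = \frac{1}{2} \int_0^{\gamma} \Big[ \mathrm{MSE}_{Q_{\tilde{S}^n}}(\gamma') - \mathrm{MMSE}_{P_{\tilde{S}^n}}(\gamma') \Big]\, d\gamma'. \qquad (9)
\]

Here, $P^{(\gamma)}_{Y^n}$ represents the distribution of $Y^n$ induced by $P_{\tilde{S}^n}$. Similarly, $Q^{(\gamma)}_{Y^n}$ represents the distribution of $Y^n$
induced by $Q_{\tilde{S}^n}$.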

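We first give a bound on $D\big(P^{(\gamma)}_{Y^n} \,\big\|\, Q^{(\gamma)}_{Y^n}\big)$.

Note that since $X^n$ is a function of $S_2^n$, we have the following.

\[
\begin{aligned}
\text{Under } P^{(\gamma)}_{Y^n|S_2^n} &: \quad Y^n | S_2^n \sim N\Big(X^n + S_2^n, \tfrac{1}{\gamma} I_{n \times n}\Big), \\
\text{Under } Q^{(\gamma)}_{Y^n|S_2^n} &: \quad Y^n | S_2^n \sim N\Big(S_2^n, \tfrac{1}{\gamma} I_{n \times n}\Big).
\end{aligned}
\]

Hence, $D\big(P^{(\gamma)}_{Y^n|S_2^n} \,\big\|\, Q^{(\gamma)}_{Y^n|S_2^n}\big)$ is given by the divergence between two multivariate Gaussian random variables
with the same covariance matrix. In our case, the divergence is given by

\[
D\big(P^{(\gamma)}_{Y^n|S_2^n} \,\big\|\, Q^{(\gamma)}_{Y^n|S_2^n}\big) = \frac{\gamma}{2} \|X^n\|^2.
\]

Hence,

\[
D\big(P^{(\gamma)}_{Y^n} \,\big\|\, Q^{(\gamma)}_{Y^n}\big)
\le \mathrm{E}_{S_2^n}\, D\big(P^{(\gamma)}_{Y^n|S_2^n} \,\big\|\, Q^{(\gamma)}_{Y^n|S_2^n}\big)
= \mathrm{E}_{S_2^n}\Big( \frac{\gamma}{2} \|X^n\|^2 \Big)
\le \frac{n \gamma P}{2}.
\]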
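From (9), and since the integrand in (9) is non-negative, we have

\[
\int_{\gamma_0}^{\gamma_1} \Big[ \mathrm{MSE}_{Q_{\tilde{S}^n}}(\gamma) - \mathrm{MMSE}_{P_{\tilde{S}^n}}(\gamma) \Big]\, d\gamma
\le \int_{0}^{\gamma_1} \Big[ \mathrm{MSE}_{Q_{\tilde{S}^n}}(\gamma) - \mathrm{MMSE}_{P_{\tilde{S}^n}}(\gamma) \Big]\, d\gamma
\le n \gamma_1 P
\]

for $\gamma_1 > \gamma_0$. Hence,

\[
\int_{\gamma_0}^{\gamma_1} \mathrm{MMSE}_{P_{\tilde{S}^n}}(\gamma)\, d\gamma \ge \int_{\gamma_0}^{\gamma_1} \mathrm{MSE}_{Q_{\tilde{S}^n}}(\gamma)\, d\gamma - n \gamma_1 P. \qquad (10)
\]

Since $\mathrm{MMSE}_{P_{\tilde{S}^n}}(\gamma)$ is a non-increasing function of $\gamma$, we have
$\int_{\gamma_0}^{\gamma_1} \mathrm{MMSE}_{P_{\tilde{S}^n}}(\gamma)\, d\gamma \le (\gamma_1 - \gamma_0)\, \mathrm{MMSE}_{P_{\tilde{S}^n}}(\gamma_0) = (\gamma_1 - \gamma_0)\, \mathrm{MMSE}(\gamma_0)$.
Next, we note that $\alpha = 1/(1 + \gamma P_2)$, so we can write

\[
\mathrm{MSE}_{Q_{\tilde{S}^n}}(\gamma) = \mathrm{MSE}_Q(\gamma) \ge \frac{n P_2}{1 + \gamma P_2} - \frac{2n\sqrt{P P_2}}{(1 + \gamma P_2)^2}.
\]

From (10) and the arguments above, we have

\[
\begin{aligned}
(\gamma_1 - \gamma_0)\, \mathrm{MMSE}(\gamma_0)
&\ge \int_{\gamma_0}^{\gamma_1} \left( \frac{n P_2}{1 + \gamma P_2} - \frac{2n\sqrt{P P_2}}{(1 + \gamma P_2)^2} \right) d\gamma - n \gamma_1 P \\
&= n \ln \frac{1 + \gamma_1 P_2}{1 + \gamma_0 P_2} + \frac{2n\sqrt{P}}{\sqrt{P_2}\,(1 + \gamma_1 P_2)} - \frac{2n\sqrt{P}}{\sqrt{P_2}\,(1 + \gamma_0 P_2)} - n \gamma_1 P.
\end{aligned}
\]

Finally, using the relationship $D(P)_{\min} = \mathrm{MMSE}(1)/n$, setting $\gamma_0 = 1$ and applying the above completes the proof of the
lower bound. ■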
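The closed-form integral in the last step can be sanity-checked numerically. The sketch below works per symbol (the common factor $n$ is dropped) and assumes the integrand $\frac{P_2}{1 + \gamma P_2} - \frac{2\sqrt{P P_2}}{(1 + \gamma P_2)^2}$; it compares a composite Simpson approximation with the closed form. All function names are illustrative.

```python
import math


def integrand(gamma, P, P2):
    """Per-symbol lower bound on the mismatched MSE as a function of gamma."""
    return P2 / (1.0 + gamma * P2) - 2.0 * math.sqrt(P * P2) / (1.0 + gamma * P2) ** 2


def closed_form(g0, g1, P, P2):
    """Closed-form value of the integral of `integrand` from g0 to g1."""
    r = math.sqrt(P / P2)
    return (math.log((1.0 + g1 * P2) / (1.0 + g0 * P2))
            + 2.0 * r / (1.0 + g1 * P2)
            - 2.0 * r / (1.0 + g0 * P2))


def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return total * h / 3.0
```

Subtracting $\gamma_1 P$ from `closed_form(1.0, g1, P, P2)` and dividing by $\gamma_1 - 1$ gives the per-symbol bound for the choice $\gamma_0 = 1$.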

In Proposition 7, we related the minimum mean square error that a decoder incurs when it uses the true distribution
to the mean square error incurred by a decoder if it uses the possibly erroneous distribution of $X^n = 0$. Clearly,
we do not need to choose $X^n = 0$ as the erroneous distribution; we can also choose other distributions. This
is the main idea behind our generalization of Proposition 7, which we state in Theorem 8.

Theorem 8. A lower bound for the problem of Gaussian estimation with helper is given by 

, 1 + 7^/ ^ 1 1 



(7 - l)^(P)mi„ > log(^^V) + 

:P2 



l + Pi (1 + 7^/) (1 + ^/) 

P2 , P2 c^l 



P/(1+7P/) Pi{l + Pi) l+7rP 

T- -fs + 1 + log(TT ^) 

1 + 71 rP 1 + 71 rP 



ax 



'2 - bx* 



where a — p^^^_^_p^-^ Pj(i+^p^) i+.yrP' " ^ |2(pi(i+Pi) Piii+jPi) + i+7-rp)l v -^2 and 

y/P ifa<0 

b/2a ifa>0 and b/2a < \/P , 

VP otherwise 



for any $\gamma > 1$, real number $c$ and $r > 0$.

As with Proposition 7, Theorem 8 gives a family of bounds. Any $\gamma > 1$, real number $c$ and $r > 0$ yields a bound
on the achievable distortion. Theorem 8 is proved in Appendix IV.
C. Comparison of bounds I

We now show some plots comparing the various bounds we derived with the lower bound proposed in [3]
(Theorem 6). For the purpose of comparisons, we set $P_2$ at a fixed level and vary the power of the encoder.
We then compute the lower bounds on distortion given in Theorem 6, Proposition 7 and Theorem 8, as well as the
achievable distortion given in Theorem 5.

The plots for $P_2 = 0.1$, $P_2 = 1$ and $P_2 = 10$ are shown in Figures 5, 6 and 7, respectively. As we can see from
the plots, the generalized lower bound in Theorem 8 can significantly improve on the lower bound of Theorem 6
for several different levels of $P_2$.



Fig. 5: Comparison of bounds for $P_2 = 0.1$. Y-axis represents distortion level and X-axis represents the power constraint.

D. Comparison of bounds II 

In this subsection, we compare our lower bound given in Theorem 8 to the lower bound given in [6] (Theorem
7). For ease of numerical computation, we compare our lower bound to the following upper bound on Theorem 7:

\[
D(P)_{\min} \ge \min_{\sigma_{XS_2}} \sup_{\gamma > 0} \frac{1}{\gamma^2}
\left( \left[ \sqrt{\frac{P_2}{1 + P_2 + P + 2\sigma_{XS_2}}}
- \sqrt{(1 - \gamma)^2 P_2 + \gamma^2 P - 2\gamma(1 - \gamma)\sigma_{XS_2}} \right]^+ \right)^2, \qquad (11)
\]

where $[\cdot]^+$ denotes the positive part and the minimum is taken over a discretization of the interval $[-\sqrt{P_2 P}, \sqrt{P_2 P}]$. The plots
showing comparisons of the lower bound proposed in Theorem 8 and the lower bound given in inequality (11) for
$P_2 = 1, 10, 100$ are given in Figures 8, 9 and 10, respectively.
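A discretized bound of this kind is straightforward to evaluate. The sketch below assumes the expression $\frac{1}{\gamma^2}\big(\big[\sqrt{P_2/(1 + P_2 + P + 2\sigma)} - \sqrt{(1-\gamma)^2 P_2 + \gamma^2 P - 2\gamma(1-\gamma)\sigma}\big]^+\big)^2$, takes the minimum over a uniform grid of $\sigma \in [-\sqrt{P_2 P}, \sqrt{P_2 P}]$, and approximates the supremum over $\gamma$ by a maximum over a finite candidate list. The grid sizes and function names are illustrative, and restricting $\gamma$ to a finite list can only decrease the computed value.

```python
import math


def inner_value(P, P2, sigma, gamma):
    """Expression inside the min-sup for one (sigma, gamma) pair."""
    kappa = P2 / (1.0 + P2 + P + 2.0 * sigma)
    second = (1.0 - gamma) ** 2 * P2 + gamma ** 2 * P - 2.0 * gamma * (1.0 - gamma) * sigma
    second = max(second, 0.0)  # guard against tiny negative rounding errors
    gap = max(math.sqrt(kappa) - math.sqrt(second), 0.0)
    return gap * gap / gamma ** 2


def discretized_bound(P, P2, n_sigma=201, gammas=None):
    """Min over a sigma grid of the max over candidate gammas."""
    if gammas is None:
        gammas = [k / 20.0 for k in range(1, 61)]  # 0.05, 0.10, ..., 3.0 (includes 1.0)
    s_max = math.sqrt(P * P2)
    best = float("inf")
    for k in range(n_sigma):
        sigma = -s_max + 2.0 * s_max * k / (n_sigma - 1)
        best = min(best, max(inner_value(P, P2, sigma, g) for g in gammas))
    return best
```

Since the candidate list contains $\gamma = 1$, the computed value is never smaller than the $\gamma = 1$ expression evaluated at the largest $\sigma$, which has the same form as a bound with $\kappa = P_2/(P_2 + 2\sqrt{P P_2} + P + 1)$.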

As can be seen from the plots, the two bounds now cross each other. While the lower bound given in [6] can be
better than that given in Theorem 8 in some regimes, we can also see that Theorem 8 can be strictly better than
Theorem 7 in other regimes, particularly when $P_2$ is large and the power budget $P$ of the encoder is small.
Fig. 6: Comparison of bounds for $P_2 = 1$ (curves: Theorem 5, Theorem 8, Proposition 7, Theorem 6). Y-axis represents distortion level and X-axis represents the power constraint.

Fig. 7: Comparison of bounds for $P_2 = 10$ (curves: Theorem 5, Theorem 8, Proposition 7, Theorem 6). Y-axis represents distortion level and X-axis represents the power constraint.

Fig. 8: Comparison of bounds for $P_2 = 1$. Y-axis represents distortion level and X-axis represents the power constraint.

Fig. 9: Comparison of bounds for $P_2 = 10$. Y-axis represents distortion level and X-axis represents the power constraint.

Fig. 10: Comparison of bounds for $P_2 = 100$. Y-axis represents distortion level and X-axis represents the power constraint.



VI. When Si is also available at the encoder 

In this section, we turn our attention to the problem of reconstructing $S_1$ when both $S_1$ and $S_2$ are available at
the encoder, as defined in Section III-B. As with previous sections, the focus of this section is on lower bounds
for this setup, but we also use lower and upper bounds to derive constant multiplicative gap results between the
achievable distortions and lower bounds. As we mentioned in the Introduction, our setting is a special case of the
setting considered in [12]. We first review some known results found in that paper specialized to our setting, and
then present our results, which include a generalization of the lower bound in [12] that can be strictly larger.



A. Upper and lower bounds 

We first present an achievability scheme for this setting.

Theorem 9. (See also [12].) An achievable distortion-cost region for the problem of estimation with a helper who
has non-causal access to both the interference and the signal is given by

Pi 



D{P). 



< 



1 



Pi 



(^/3^^ + lj P2+P(l-a^ 




where we minimize over $-1 \le \alpha \le 1$, $-1 \le \beta \le 1$ and $0 \le \alpha^2 + \beta^2 \le 1$.¹

As the achievability scheme is largely the same as that in [12], we only give a sketch in Appendix V.
We now turn to lower bounds on the distortion-cost region. We first present without proof two lower bounds in
the following two propositions. For their proofs, cf. [12] or the proof of Theorem 10 below.

¹ In [12], the authors minimize only over $0 \le \alpha \le 1$, $0 \le \beta \le 1$ and $0 \le \alpha^2 + \beta^2 \le 1$, but it is easy to see that their proof carries over to
the range stated in this Theorem.
Proposition 8. A lower bound for the problem of estimation with a helper who knows both the interference and
the signal noncausally is given by

\[
D(P)_{\min} \ge \frac{P_1}{\left(1 + \frac{P_1}{P_2}\right)\left(1 + \frac{P}{N}\right)}.
\]

Remark VI.1. When $P_2 \to \infty$, we see that $D(P)_{\min} \ge \frac{P_1}{1 + P/N}$. This bound is achievable by noting that for
$\beta = 0$, we have $D = \frac{P_1}{1 + P/N}$ in Theorem 9. Thus a separation scheme is optimal when $P_2 \to \infty$.

Proposition 9. A lower bound for the problem of estimation with a helper who knows both the interference and
the signal noncausally is given by

\[
D(P)_{\min} \ge \frac{P_1}{1 + \frac{(\sqrt{P} + \sqrt{P_1})^2}{N}}.
\]



Remark VI.2. As P2 — > 0, our setting reduces to that of state amplification HTj. From the results therein, the 
bound of Proposition\9\is optimal when P2 — > 0. 

We now present our lower bound. 

Theorem 10. A lower bound for the problem of estimation with a helper that knows both the interference and the
signal noncausally is given by

\[
D(P)_{\min} \ge \frac{\alpha^2 P_1 P_2}{P_1 + \alpha^2 P_2} \cdot \frac{N}{\mathrm{MSE}(\alpha)}
\]

for any $\alpha \in \mathbb{R}$, $\alpha \ne 0$, where $\mathrm{MSE}(\alpha)$ is given by the optimum value of the following convex (quadratic)
optimization problem:

\[
\max_{|\rho_{XS_1}| \le \sqrt{P P_1},\; |\rho_{XS_2}| \le \sqrt{P P_2}}
P + (1 - \alpha)^2 P_2 + 2(1 - \alpha)\rho_{XS_2} + N
- \frac{\big((1 - \alpha)\alpha P_2 + \alpha \rho_{XS_2} + \rho_{XS_1}\big)^2}{P_1 + \alpha^2 P_2}.
\]

It can be shown that setting $\alpha = 1$ and $\alpha \to \infty$ recovers the bounds in Propositions 8 and 9, respectively. The
cases of $\alpha = 1$ and $\alpha = \infty$ correspond to supplying $S_1 + S_2$ and $S_2$, respectively, to the decoder and then lower
bounding the distortion.

Note that while finding the optimum value of $\alpha$ may be difficult, Theorem 10 gives a lower bound for every
$\alpha$. We note also that while computation of the lower bound requires solving an optimization problem for each
$\alpha$, unlike the lower bounds in Propositions 8 and 9, the optimization problem is quadratic and can be efficiently
solved [19], [20].
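For two correlation variables, the inner maximization is small enough to solve without a general-purpose solver. The sketch below assumes the objective $P + (1-\alpha)^2 P_2 + 2(1-\alpha)\rho_{XS_2} + N - \frac{((1-\alpha)\alpha P_2 + \alpha\rho_{XS_2} + \rho_{XS_1})^2}{P_1 + \alpha^2 P_2}$ and the resulting bound $\frac{\alpha^2 P_1 P_2}{P_1 + \alpha^2 P_2} \cdot \frac{N}{\mathrm{MSE}(\alpha)}$. It maximizes over $\rho_{XS_1}$ in closed form (by clipping) and runs a ternary search over $\rho_{XS_2}$, which is valid because partial maximization of a concave function is concave. This is an illustrative substitute for the convex solvers cited in the text, and all names are hypothetical.

```python
import math


def mse_objective(alpha, rho1, rho2, P, P1, P2, N):
    """Concave quadratic objective in (rho1, rho2) for a fixed alpha."""
    num = (1.0 - alpha) * alpha * P2 + alpha * rho2 + rho1
    return (P + (1.0 - alpha) ** 2 * P2 + 2.0 * (1.0 - alpha) * rho2 + N
            - num * num / (P1 + alpha * alpha * P2))


def max_mse(alpha, P, P1, P2, N, iters=200):
    """Maximize the objective over the box |rho1| <= sqrt(P*P1), |rho2| <= sqrt(P*P2)."""
    r1, r2 = math.sqrt(P * P1), math.sqrt(P * P2)
    c0 = (1.0 - alpha) * alpha * P2

    def best_over_rho1(rho2):
        # Only the squared term depends on rho1; drive it toward zero, then clip.
        rho1 = min(max(-c0 - alpha * rho2, -r1), r1)
        return mse_objective(alpha, rho1, rho2, P, P1, P2, N)

    lo, hi = -r2, r2
    for _ in range(iters):  # ternary search on a concave 1-D function
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if best_over_rho1(m1) < best_over_rho1(m2):
            lo = m1
        else:
            hi = m2
    return best_over_rho1((lo + hi) / 2.0)


def theorem10_bound(alpha, P, P1, P2, N):
    """Lower bound on the distortion for a given nonzero alpha (assumed form)."""
    cond_var = alpha ** 2 * P1 * P2 / (P1 + alpha ** 2 * P2)  # Var(S1 | S1 + alpha*S2)
    return cond_var * N / max_mse(alpha, P, P1, P2, N)
```

With $P = P_1 = N = 1$ and $P_2 = 2$, `theorem10_bound(1.0, 1.0, 1.0, 2.0, 1.0)` matches $P_1/((1 + P_1/P_2)(1 + P/N)) = 1/3$, and large values of $\alpha$ approach $P_1/(1 + (\sqrt{P} + \sqrt{P_1})^2/N) = 0.2$. Sweeping $\alpha$ over a grid and keeping the largest value gives the best bound from this family.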

Proof: The idea in the proof of Theorem 10 lies in giving side information $S_1 + \alpha S_2$ to the decoder instead of
just $S_1 + S_2$ or $S_2$ as in Propositions 8 and 9, respectively, and then a more careful bounding of the terms appearing
in the distortion calculation using linear minimum mean square error estimation and convex optimization.

From the data processing inequality,

\[
\begin{aligned}
I(S_1^n; \hat{S}_1^n \,|\, S_1^n + \alpha S_2^n) &\le I(S_1^n; Y^n \,|\, S_1^n + \alpha S_2^n) \\
&\le \sum_{i=1}^{n} h(Y_i \,|\, S_{1i} + \alpha S_{2i}) - \frac{n}{2} \log 2\pi e N \\
&\overset{(a)}{\le} n\, h(Y \,|\, S_1 + \alpha S_2, Q) - \frac{n}{2} \log 2\pi e N \\
&\le n\, h(Y \,|\, S_1 + \alpha S_2) - \frac{n}{2} \log 2\pi e N.
\end{aligned}
\]

In (a), we defined $Q \sim \mathrm{Unif}[1 : n]$ independent of all other random variables and $Y_Q = Y$, $S_{1Q} = S_1$, $S_{2Q} = S_2$
and $\hat{S}_{1Q} = \hat{S}_1$. On the other hand, we have

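\[
\begin{aligned}
I(S_1^n; \hat{S}_1^n \,|\, S_1^n + \alpha S_2^n) &= \sum_{i=1}^{n} h(S_{1i} \,|\, S_{1i} + \alpha S_{2i}) - h(S_1^n \,|\, \hat{S}_1^n, S_1^n + \alpha S_2^n) \\
&\ge \sum_{i=1}^{n} h(S_{1i} \,|\, S_{1i} + \alpha S_{2i}) - \sum_{i=1}^{n} h(S_{1i} \,|\, \hat{S}_{1i}) \\
&\ge \sum_{i=1}^{n} h(S_{1i} \,|\, S_{1i} + \alpha S_{2i}) - \sum_{i=1}^{n} h(S_{1i} - \hat{S}_{1i}) \\
&\overset{(a)}{\ge} n\, h(S_1 \,|\, S_1 + \alpha S_2) - \frac{n}{2} \log 2\pi e D(P)_{\min} \\
&= \frac{n}{2} \log \left( 2\pi e \frac{\alpha^2 P_1 P_2}{P_1 + \alpha^2 P_2} \right) - \frac{n}{2} \log 2\pi e D(P)_{\min},
\end{aligned}
\]

where (a) follows from concavity of differential entropy and the property that a Gaussian distribution maximizes
the differential entropy for a given second moment. Therefore,

\[
\begin{aligned}
\frac{1}{2} \log \left( 2\pi e \frac{\alpha^2 P_1 P_2}{P_1 + \alpha^2 P_2} \right) - \frac{1}{2} \log 2\pi e D(P)_{\min}
&\le h(Y \,|\, S_1 + \alpha S_2) - \frac{1}{2} \log 2\pi e N \\
&= h(X + (1 - \alpha) S_2 + Z \,|\, S_1 + \alpha S_2) - \frac{1}{2} \log 2\pi e N \\
&\le h(X + (1 - \alpha) S_2 + Z - k(S_1 + \alpha S_2)) - \frac{1}{2} \log 2\pi e N,
\end{aligned}
\]

where $k$ is defined as

\[
k = \frac{(1 - \alpha)\alpha P_2 + \alpha \rho_{XS_2} + \rho_{XS_1}}{P_1 + \alpha^2 P_2}
\]

with $\mathrm{E}XS_1 := \rho_{XS_1}$ and $\mathrm{E}XS_2 := \rho_{XS_2}$. From the Cauchy-Schwarz inequality and the power constraint on $X$,

\[
|\rho_{XS_1}| \le \sqrt{P P_1} \quad \text{and} \quad |\rho_{XS_2}| \le \sqrt{P P_2}.
\]

Continuing with our bound, we have

\[
h(X + (1 - \alpha) S_2 + Z - k(S_1 + \alpha S_2)) \le \frac{1}{2} \log \Big( 2\pi e\, \mathrm{E}\big(X + (1 - \alpha) S_2 + Z - k(S_1 + \alpha S_2)\big)^2 \Big).
\]

In turn, we have

\[
\begin{aligned}
\mathrm{E}\big(X + (1 - \alpha) S_2 + Z - k(S_1 + \alpha S_2)\big)^2
&= P + (1 - \alpha)^2 P_2 + 2(1 - \alpha)\rho_{XS_2} + N - \frac{\big((1 - \alpha)\alpha P_2 + \alpha \rho_{XS_2} + \rho_{XS_1}\big)^2}{P_1 + \alpha^2 P_2} \\
&:= \mathrm{MSE}(\alpha, \rho_{XS_1}, \rho_{XS_2}).
\end{aligned}
\]

Note now that for $\alpha$ fixed, $\mathrm{MSE}(\alpha, \rho_{XS_1}, \rho_{XS_2})$ is a concave (quadratic) function of $\rho_{XS_1}$ and $\rho_{XS_2}$, and the
constraints $|\rho_{XS_1}| \le \sqrt{P P_1}$ and $|\rho_{XS_2}| \le \sqrt{P P_2}$ are linear constraints. Hence, we can find the maximum value
using convex optimization. Letting $\rho^*_{XS_1}$ and $\rho^*_{XS_2}$ denote the optimal solutions to the optimization problem, we
arrive at the lower bound for the achievable distortion:

\[
D(P)_{\min} \ge \frac{\alpha^2 P_1 P_2}{P_1 + \alpha^2 P_2} \cdot \frac{N}{\mathrm{MSE}(\alpha, \rho^*_{XS_1}, \rho^*_{XS_2})}.
\]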
Fig. 11: Comparison of bounds for estimation with a helper that knows both the interference and the source. This
figure gives a plot of the various bounds on distortion for different values of $P_2$.



Comparison of bounds 

As we mentioned earlier, Theorem 10 includes the bounds in Propositions 8 and 9. It can also be larger, as we
now show in an example.

Let $P_1 = 1$, $N = 1$ and $P = 1$. We vary $P_2$ and compare the bounds obtained for different values of $P_2$. The
plots comparing the various upper and lower bounds are given in Figure 11. As can be seen from Figure 11, the
lower bound given by Theorem 10 can be strictly better than that given by previous lower bounds. As we noted in
the proof of Theorem 10, the improvement comes from two aspects: giving $S_1 + \alpha S_2$ to the decoder and a more
careful bounding via linear minimum mean square error estimation and convex optimization. The reader may
ask whether it is necessary to use $S_1 + \alpha S_2$ instead of just setting $\alpha = 1$ or $\alpha \to \infty$ and calculating the bounds
more carefully using linear estimation and convex optimization. In our simulation, we noted that for some values
of $P_2$, moderate values of $\alpha$, such as $\alpha = 2, 3$, give better bounds than $\alpha = 1$ or $\alpha = 20$. This shows that using
$S_1 + \alpha S_2$ does lead to better bounds than using $S_1 + S_2$ or $S_2$ alone.



B. Constant gap results 

In our simulations, we note that the upper and lower bounds appear to be quite close. This suggests that
constant multiplicative gap results on the distortion may be possible under some conditions on the input, source
and interference powers. This is indeed the case: our next result states that when the interference power is
larger than a threshold (that depends on the system parameters), the lower and upper bounds are within a constant
multiplicative gap.

Theorem 11. If 



'P2> 



-f^P + 7Vp(2 + 7VP)(P(1 - 72) + N)- jVp 



jVP{2 + 7\/P) 



(12) 



with 7 = 



- . / 4P+N) 



< £ < 



then the multiplicative gap between the upper bound in Theorem^ -D achievable. 
and the lower bound in Proposition^ D\\^, is at most 1/(1 — e). That is. 



-^achievable ^ ^ 



have 



i^ib - 1 - e 
Proof: We begin the proof by evaluating the distortion achieved by Theorem |9] for a = —j3 = J 

e{P + N) 



P(l-a2_^2N p 



(1-^)1 + 



P 



P J ' ' \ N 

Now from the condition on P2 stated in the Theorem (see (fT2]i). it follows that, 

P2aVP{2 + aVP) + v/^2aVP - P(l - a^) - A^ > 
=> P2aVP{2 + aVP) + \fPi{2 + aVP)aVP - a^P - P(l - 2a'^) - N >0 



P2 



<^ ^ (2 + cn2WP _ £^ 




«Vp+^') (2 + aVP- 



P9, 



[aVP+lf 



> P{1 -2a^) 

> P{1 - 2a^) 

> P{l-2a^) 
1 




(a/P+l)2Pi 



i-a^^ + 1)2P2 + P(l - 2a2) + N 
P(l-a2-/32) 



> 



P2 



{13^^ + IYP2 + TV + P(l - a2 _ /32) 
which implies 




e(P+JV) 
2P 



TV 



> (1 + ^1 (1 



-^achievable ^ ^ 



Dib 



1-e 



P 

iV 



We 

(13) 



TV 



N 



N 



(1-e), 



VII. Conclusion 

In this paper, we defined and analyzed the problem of estimation with a helper that knows the interference. In
the discrete memoryless case when the interfering signal, $S_2$, is known causally at the encoder, we characterized
the distortion-cost region. When $S_2$ is known noncausally, we proposed an achievability scheme based on hybrid
coding. In the binary estimation with a helper problem, we also proposed two lower bounds. Using the upper and
lower bounds, we characterized the distortion-cost region when the problem parameters $C$, $p_1$ and $p_2$ satisfy one
of several nontrivial conditions.

In the Gaussian case, we derived a lower bound based on a recent result by Verdú relating divergence and
mismatched estimation. We showed through numerical simulations that this lower bound can be strictly better than
the previous lower bound derived in [3]. Similar to the binary case, we also characterized the distortion-cost region
when the problem parameters $P$, $P_1$ and $P_2$ satisfy one of several conditions.

We also extended our analysis in the Gaussian case to consider the case when the helper knows both $S_1$ and $S_2$
noncausally. In this case, we derived a lower bound that contains previous lower bounds proposed in [12] and can
be strictly better. We also obtained constant multiplicative gap results for this setting.

In deriving our lower bound for the Gaussian case when only the interfering signal is known at the helper, we 
used a relationship between mismatched estimation and divergence. In the discrete case, a relationship between 
divergence and Hamming distortion exists too. One such relationship is Marton's inequality ET\ Lemma 6.3]. An 
interesting open question is whether one can use such relationships to derive a lower bound for the binary case
that is strictly better than the bounds we proposed.
Appendix I
Sketch of Achievability for Theorem 1

We use block Markov coding over $B$ blocks. The scheme in each block is basically a separation scheme, where
we use the random variable $U$ for transmission of a message from the previous block. The message itself is a
Wyner-Ziv description [22] of $S_2^n$ from the previous block. More concretely, in each block $j \in [1 : B]$, the
transmission codebook is generated as follows: Generate $2^{n(I(U;Y) - \epsilon)}$ sequences $U^n(m)$ according to $\prod_{i=1}^{n} p(u_i)$. The
compression codebook is generated by the following two-step procedure: Generate $2^{n(I(V;U,S_2) + \epsilon)}$ $V^n$ sequences
according to $\prod_{i=1}^{n} p(v_i)$. Partition the set of $V^n$ sequences into $2^{n(I(V;S_2|U,Y) + 2\epsilon)}$ bins $\mathcal{B}(M_j)$.

For encoding, at the end of block $j$, assume that the codeword $U^n(m_j)$ was sent. The encoder then finds a $V^n(j)$
sequence that is jointly typical with $(U^n(m_j), S_2^n(j))$. If there is more than one such sequence, it picks one
uniformly at random from the set of jointly typical sequences. This operation succeeds with high probability as
$n \to \infty$ since there are $2^{n(I(V;U,S_2) + \epsilon)}$ $V^n$ sequences. The encoder then finds the bin index $M_{j+1}$ such that
$V^n(j) \in \mathcal{B}(M_{j+1})$. It then sends out the index $M_{j+1}$ in block $j + 1$ by selecting $U^n(M_{j+1})$ and sending out the $X^n$
sequence encoded as $x_i = f(u_i(M_{j+1}), s_{2i}(j + 1))$. For the first block, the encoder sends an arbitrary message.
This encoding operation requires the condition that

\[
I(U;Y) - \epsilon > I(V;S_2|U,Y) + 2\epsilon.
\]
For decoding, at the end of block $j + 1$, the decoder first decodes the bin index $M_{j+1}$. From standard arguments
(see, e.g., [11, Chapter 7]), this decoding operation succeeds with high probability provided

\[
I(U;Y) - \epsilon > I(V;S_2|U,Y) + 2\epsilon.
\]

Once the decoder recovers the bin index $M_{j+1}$, it then recovers the true $V^n(j)$ codeword by looking for $v^n(j) \in
\mathcal{B}(M_{j+1})$ such that $(u^n(m_j), y^n(j), v^n(j)) \in \mathcal{T}_\epsilon^{(n)}$. It then reconstructs $\hat{S}_1^n(j)$ as $\hat{s}_1(u_i(m_j), y_i(j), v_i(j))$ for
$i \in [1 : n]$. From the rates given and standard arguments (see [11, Chapter 3 and Chapter 11]), the expected
distortion for $S_1^n(j)$ in block $j$ is less than or equal to $\mathrm{E}\, d(S_1, \hat{S}_1(U, V, Y)) + \delta(\epsilon)$, where $\delta(\epsilon) \to 0$ as $\epsilon \to 0$.
This decoding and reconstruction procedure applies for the first $B - 1$ blocks, and for the $B$th block, we simply
reconstruct $S_1^n(B)$ according to an arbitrary symbol $\hat{s}_1 \in \hat{\mathcal{S}}_1$, incurring a distortion that is bounded by $D_{\max}$, where
$D_{\max} := \max_{s_1, \hat{s}_1} d(s_1, \hat{s}_1)$. The per symbol distortion over $B$ blocks is then upper bounded by $D + \delta'(\epsilon)$, where
$\delta'(\epsilon) \to 0$ as $\epsilon \to 0$.
We now note that the above achievability scheme takes care of the case when $I(U;Y) > I(V;S_2|U,Y)$. The boundary case of $I(U;Y) = I(V;S_2|U,Y)$ can be handled as follows. Assume first that $I(U;Y) > 0$. Define $U' = (U, Q)$, with $Q \in \{1,2\}$ independent of the other random variables, $V' = V$ when $Q = 1$ and $V' = \emptyset$ when $Q = 2$. Let $X = f(U, S_2)$ regardless of $Q$, and let $\hat{s}_1(U', V', Y') = \hat{s}_1(U, V, Y)$ if $Q = 1$ and $\tilde{s}_1$ if $Q = 2$, where $\tilde{s}_1$ is an arbitrary symbol belonging to $\hat{\mathcal{S}}_1$. Let $P(Q = 1) = p_1$. We have

$$I(U';Y') \ge I(U;Y),$$
$$I(V';S_2|U',Y') = p_1 I(V;S_2|U,Y),$$
$$\mathrm{E}\, d(S_1, \hat{S}_1(U', V', Y')) \le p_1 \mathrm{E}\, d(S_1, \hat{S}_1(U, V, Y)) + (1 - p_1) D_{\max}.$$

With this choice of random variables, $I(U';Y') > I(V';S_2|U',Y')$ whenever $p_1 < 1$, and we can then apply the achievability scheme we discussed, at the expense of larger distortion. By choosing $p_1(n) = 1 - \epsilon_n$, where $\epsilon_n \to 0$ as $n \to \infty$, we can apply our achievability scheme for blocklength $n$ sufficiently large, with the resulting expected distortion converging to $\mathrm{E}\, d(S_1, \hat{S}_1(U, V, Y))$ as $n \to \infty$.
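The time-sharing perturbation above can be illustrated numerically. The following is a minimal sketch, not part of the original proof: the function name `time_shared_point` and the input values (a boundary point with both mutual informations equal to 0.5, distortion 0.1 and $D_{\max} = 1$) are hypothetical, chosen only to show that the rate condition becomes strict while the distortion bound approaches the original distortion as $p_1 \to 1$.

```python
def time_shared_point(i_uy, i_v, d, d_max, p1):
    """Bounds for the time-shared variables (U', V') with P(Q = 1) = p1.

    i_uy  : I(U;Y) of the original point
    i_v   : I(V;S2|U,Y) of the original point
    d     : E d(S1, S1_hat(U,V,Y)) of the original point
    d_max : maximum per-symbol distortion
    """
    i_uy_new = i_uy                      # lower bound: I(U';Y') >= I(U;Y)
    i_v_new = p1 * i_v                   # I(V';S2|U',Y') = p1 * I(V;S2|U,Y)
    d_new = p1 * d + (1.0 - p1) * d_max  # upper bound on the new distortion
    return i_uy_new, i_v_new, d_new


# Hypothetical boundary point: I(U;Y) = I(V;S2|U,Y) = 0.5.
points = [time_shared_point(0.5, 0.5, 0.1, 1.0, 1.0 - eps)
          for eps in (0.1, 0.01, 0.001)]
for i_uy_new, i_v_new, d_new in points:
    print(i_uy_new > i_v_new, round(d_new, 4))
```

For every $p_1 < 1$ the strict inequality holds, and the distortion penalty $(1 - p_1)(D_{\max} - d)$ shrinks linearly in $1 - p_1$.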

For the case of $I(U;Y) = I(V;S_2|U,Y) = 0$, it can be shown that the decoder can perform the reconstruction based only on $\hat{s}_1(y_i, u_i)$ for $i \in [1:n]$. Achievability in this case requires no block Markov coding. We only need to generate one transmission codeword $U^n$ and transmit $X^n$ according to $x_i = f(u_i, s_{2i})$. The decoder reconstructs $S_1^n$ as $\hat{s}_1(u_i, y_i)$ for $i \in [1:n]$.

Appendix II 
Proof of Claim 1

The causal region in Theorem 1 is given by

$$\min \; \mathrm{E}\, d(S_1, \hat{S}_1(U,V,Y))$$

subject to

$$I(U;Y) \ge I(V;S_2|U,Y),$$
$$\mathrm{E} X \le C$$

for some $p(u)p(v|u,s_2)$ and function $x(u,s_2)$. We prove that $D(0.11)_{\text{min-causal}} > 0$ by contradiction. Suppose that there exist $U, V$ satisfying the constraints such that $\mathrm{E}\, d(S_1, \hat{S}_1(U,V,Y)) = 0$. This implies in particular that $H(S_1|U,V,Y) = 0$. Hence,

$$\begin{aligned}
I(V;S_2|U,Y) &= I(V,S_1;S_2|U,Y) \\
&\ge I(S_1;S_2|U,Y) \\
&= H(S_1|U,Y) - H(S_1|S_2,U,Y) \\
&= H(S_1|U,Y) \\
&= H(S_1,Y|U) - H(Y|U) \\
&= H(S_1) + H(Y|U,S_1) - H(Y|U).
\end{aligned}$$
The last step follows from $U$ being independent of $S_1$. Since we require $I(U;Y) \ge I(V;S_2|U,Y)$, and we know that $H(S_1) + H(Y|U,S_1) - H(Y|U) \le I(V;S_2|U,Y)$, a necessary condition for $\mathrm{E}\, d(S_1, \hat{S}_1(U,V,Y)) = 0$ is

$$\begin{aligned}
& H(S_1) + H(Y|U,S_1) - H(Y|U) \le I(U;Y) \\
\Rightarrow \; & H(S_1) + H(Y|U,S_1) \le H(Y).
\end{aligned}$$

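Define the subsets of $\mathcal{U}$ as follows: $\mathcal{U}_0 := \{u : x(u,s_2) = 0\}$; $\mathcal{U}_1 := \{u : x(u,s_2) = 1\}$; $\mathcal{U}_s := \{u : x(u,s_2) = s_2\}$; and $\mathcal{U}_{\bar{s}} := \{u : x(u,s_2) = 1 \oplus s_2\}$. Note the following.

• For $u \in \mathcal{U}_0$, $H(Y|U = u, S_1) = 1$ since $S_2$ is independent of $U, S_1$.

• For $u \in \mathcal{U}_1$, $H(Y|U = u, S_1) = 1$ since $S_2$ is independent of $U, S_1$.

• For $u \in \mathcal{U}_s$, $H(Y|U = u, S_1) = 0$ since $S_2 \oplus X = 0$ and $Y = S_2 \oplus X \oplus S_1$.

• For $u \in \mathcal{U}_{\bar{s}}$, $H(Y|U = u, S_1) = 0$ since $S_2 \oplus X = 1$ and $Y = S_2 \oplus X \oplus S_1$.

Further, define $p_{u0} = \sum_{u \in \mathcal{U}_0} p(u)$, $p_{u1} = \sum_{u \in \mathcal{U}_1} p(u)$, $p_s = \sum_{u \in \mathcal{U}_s} p(u)$, and $p_{\bar{s}} = \sum_{u \in \mathcal{U}_{\bar{s}}} p(u)$. Then,

$$\begin{aligned}
H(S_1) + H(Y|U,S_1) &= H_2(p_1) + p_{u0} + p_{u1} \\
&= H_2(p_1) + 1 - C_s,
\end{aligned}$$

where $C_s = p_s + p_{\bar{s}}$.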

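The cost constraint can be expressed as

$$\begin{aligned}
\mathrm{E} X &= p_{u1} + \tfrac{1}{2}(p_s + p_{\bar{s}}) \\
&= p_{u1} + \tfrac{1}{2} C_s \\
&\le C,
\end{aligned}$$

where $C = 0.11$. In particular, the cost constraint implies that $C_s \le 2C$. Hence,

$$H(S_1) + H(Y|U,S_1) \ge 1 + H_2(p_1) - 2C.$$

Now, since $p_1 = 0.1$ and $C = 0.11$, we see that

$$\begin{aligned}
H(S_1) + H(Y|U,S_1) &> 1 \\
&\ge H(Y),
\end{aligned}$$

which is a contradiction.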
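The final numerical step can be checked directly. The short sketch below evaluates $1 + H_2(p_1) - 2C$ for $p_1 = 0.1$ and $C = 0.11$; the helper name `binary_entropy` is ours, and entropies are measured in bits, as the bound $H(Y) \le 1$ for binary $Y$ requires.

```python
import math


def binary_entropy(p):
    """Binary entropy H_2(p) in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


p1 = 0.1   # parameter of S1 used in the proof
C = 0.11   # cost constraint used in the proof

# Lower bound on H(S1) + H(Y|U,S1) obtained from C_s <= 2C.
lower_bound = 1.0 + binary_entropy(p1) - 2.0 * C
print(round(lower_bound, 3), lower_bound > 1.0)
```

Since $H_2(0.1) \approx 0.469$ exceeds $2C = 0.22$, the lower bound is strictly larger than 1, which is what the contradiction needs. The same argument goes through for any pair with $H_2(p_1) > 2C$.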

Appendix III 
Derivation of Theorem 5

The derivation of Theorem 5 follows from choosing the auxiliary random variables in Theorem 2. Starting from Theorem 2, let

$$\begin{aligned}
U &= X' + \gamma S_2, \\
X &= \alpha \sqrt{\frac{P}{P_2}}\, S_2 + X', \\
X' &\sim \mathcal{N}(0, P'), \\
\mathrm{E}(S_2 X') &= \beta \sqrt{P' P_2},
\end{aligned}$$

where $P'$ is a quantity to be calculated, and $\alpha$ and $\beta$ are restricted to be between $-1$ and $1$ to satisfy the power constraints. Observe that $X$ is a function of $U, S_2$ as required. For convenience, we will use the notation $X|Y$ to denote the minimum mean square error of $X$ given $Y$. The reconstruction function is given by

$$\hat{S}_1 = \mathrm{E}(S_1|U,Y).$$

We now determine $P'$ from the other variables using $\mathrm{E} X^2 = P$:

$$\mathrm{E} X^2 = \alpha^2 P + P' + 2\alpha\beta\sqrt{P P'} = P.$$
Solving for $P'$ gives

$$\sqrt{P'} = \frac{-2\alpha\beta\sqrt{P} + \sqrt{4\alpha^2\beta^2 P + 4(1-\alpha^2)P}}{2}.$$



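To satisfy the constraint in Theorem 2, we require

$$h(U|S_2) \ge h(U|Y).$$

Since $U, S_2, Y$ are all Gaussian random variables, this condition reduces to

$$U|S_2 \ge U|Y.$$

Now,

$$\begin{aligned}
U|S_2 &= X'|S_2 \\
&= (1-\beta^2)P'.
\end{aligned}$$

As for $U|Y$, we have

$$U|Y = \mathrm{E} U^2 - \frac{(\mathrm{E}(UY))^2}{\mathrm{E} Y^2},$$

and

$$\mathrm{E} U^2 = P' + 2\gamma\beta\sqrt{P' P_2} + \gamma^2 P_2,$$

$$\begin{aligned}
\mathrm{E}(UY) &= \mathrm{E}\left((X' + \gamma S_2)\left(\alpha\sqrt{\frac{P}{P_2}}\, S_2 + X'\right)\right) + \beta\sqrt{P' P_2} + \gamma P_2 \\
&= \alpha\beta\sqrt{P P'} + P' + \alpha\gamma\sqrt{P P_2} + \gamma\beta\sqrt{P' P_2} + \beta\sqrt{P' P_2} + \gamma P_2,
\end{aligned}$$

$$\begin{aligned}
\mathrm{E} Y^2 &= \mathrm{E} X^2 + \mathrm{E} S_2^2 + \mathrm{E} S_1^2 + 2\,\mathrm{E}\left(S_2\left(\alpha\sqrt{\frac{P}{P_2}}\, S_2 + X'\right)\right) \\
&= P + 1 + P_2 + 2\alpha\sqrt{P P_2} + 2\beta\sqrt{P' P_2}.
\end{aligned}$$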
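The expected distortion is then given by $S_1|(U,Y)$, which is

$$S_1|(U,Y) = 1 - \begin{bmatrix} \mathrm{E}(US_1) & \mathrm{E}(YS_1) \end{bmatrix} \begin{bmatrix} \mathrm{E} U^2 & \mathrm{E}(UY) \\ \mathrm{E}(UY) & \mathrm{E} Y^2 \end{bmatrix}^{-1} \begin{bmatrix} \mathrm{E}(US_1) \\ \mathrm{E}(YS_1) \end{bmatrix}.$$

We note now that $\mathrm{E}(US_1) = 0$ and $\mathrm{E}(S_1 Y) = 1$. The expression therefore works out to

$$S_1|(U,Y) = 1 - \frac{\mathrm{E} U^2}{\mathrm{E} Y^2\, \mathrm{E} U^2 - (\mathrm{E}(UY))^2}.$$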
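The quantities in this derivation are easy to evaluate numerically for a given choice of parameters. The following is a minimal sketch under the assumptions used in this appendix ($S_1$ of unit variance, $S_2$ of variance $P_2$, and $Y = X + S_1 + S_2$); the function name `theorem5_quantities` and the dictionary keys are hypothetical, and the sketch does not perform the optimization over $\alpha$, $\beta$, $\gamma$.

```python
import math


def theorem5_quantities(alpha, beta, gamma, P, P2):
    """Evaluate P', the rate condition and the distortion for given parameters."""
    # sqrt(P') from the power constraint E X^2 = P (non-negative root).
    root = (-2 * alpha * beta * math.sqrt(P)
            + math.sqrt(4 * alpha**2 * beta**2 * P + 4 * (1 - alpha**2) * P)) / 2
    Pp = root**2

    # Second moments of U and Y.
    EU2 = Pp + 2 * gamma * beta * math.sqrt(Pp * P2) + gamma**2 * P2
    EUY = (alpha * beta * math.sqrt(P * Pp) + Pp + alpha * gamma * math.sqrt(P * P2)
           + gamma * beta * math.sqrt(Pp * P2) + beta * math.sqrt(Pp * P2) + gamma * P2)
    EY2 = P + 1 + P2 + 2 * alpha * math.sqrt(P * P2) + 2 * beta * math.sqrt(Pp * P2)

    # MMSE of U given S2 and of U given Y; the constraint is U|S2 >= U|Y.
    mmse_u_s2 = (1 - beta**2) * Pp
    mmse_u_y = EU2 - EUY**2 / EY2

    # MMSE of S1 given (U, Y), using E(U S1) = 0 and E(S1 Y) = 1.
    distortion = 1 - EU2 / (EY2 * EU2 - EUY**2)
    return {"P_prime": Pp, "feasible": mmse_u_s2 >= mmse_u_y,
            "distortion": distortion}


q = theorem5_quantities(alpha=0.0, beta=0.0, gamma=0.0, P=1.0, P2=1.0)
print(q)
```

With $\alpha = \beta = \gamma = 0$ the helper's signal is independent of $S_2$, and the distortion reduces to $P_2/(1+P_2)$, the MMSE of $S_1$ observed in noise of variance $P_2$. A search over a grid of $(\alpha, \beta, \gamma)$ that keeps only feasible points would give a numerical estimate of the resulting achievable distortion.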



Appendix IV 
Proof of Theorem 8

As Theorem 8 is a generalization of Proposition 7, the proof of this theorem closely follows that of Proposition 7. As such, we only mention the areas where it differs from the proof of Proposition 7 and refer readers to Proposition 7 for the rest of the proof.

As we mentioned before, we generalize the bound by not assuming that $X^n = 0$. Instead, let us assume that under the mismatched distribution $Q$, $X$ is distributed i.i.d. according to $X \sim cS_2 + Z$, where $Z \sim \mathcal{N}(0, rP)$ is independent of $S_2$ and $r > 0$. Under this assumption, $\mathrm{MSE}_Q(\gamma)$ and $D(P_{Y^n|S_2^n} \| Q_{Y^n|S_2^n})$ used in the proof of Proposition 7 are now different. The bounds on $\mathrm{MSE}_Q(\gamma)$ and on the divergence between the true distribution and the mismatched distribution are therefore different. We calculate them as follows.

Define $\alpha$ as

$$\alpha := \frac{1}{1 + \gamma P_I},$$

where $P_I = (1 + c)^2 P_2 + rP$. Let $\mathrm{E}\|X^n\|^2 = n x^2$, where $x^2 \le P$. We now have, for $\mathrm{MSE}_Q(\gamma)$,

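$$\begin{aligned}
\frac{1}{n}\mathrm{MSE}_Q(\gamma) &= \frac{1}{n}\,\mathrm{E}\left\|S_1^n - \alpha(X^n + S_2^n + S_1^n)\right\|^2 \\
&= \frac{1}{n}\,\mathrm{E}\left\|S_1^n - \alpha(S_2^n + S_1^n)\right\|^2 - \frac{2\alpha}{n}\,\mathrm{E}\left\langle S_1^n - \alpha(S_2^n + S_1^n), X^n \right\rangle + \frac{\alpha^2}{n}\,\mathrm{E}\|X^n\|^2 \\
&= \frac{\gamma P_I^2}{(1+\gamma P_I)^2} + \alpha^2 P_2 + \frac{2\alpha^2}{n}\,\mathrm{E}\langle S_2^n, X^n\rangle + \alpha^2 x^2 \\
&= \frac{\gamma P_I^2}{(1+\gamma P_I)^2} + \frac{P_2}{(1+\gamma P_I)^2} + \frac{2}{n(1+\gamma P_I)^2}\,\mathrm{E}\langle S_2^n, X^n\rangle + \frac{x^2}{(1+\gamma P_I)^2} \\
&= \frac{P_I}{1+\gamma P_I} - \frac{P_I}{(1+\gamma P_I)^2} + \frac{P_2}{(1+\gamma P_I)^2} + \frac{2}{n(1+\gamma P_I)^2}\,\mathrm{E}\langle S_2^n, X^n\rangle + \frac{x^2}{(1+\gamma P_I)^2}.
\end{aligned}$$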

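It remains to calculate an upper bound on the divergence. As before, $P_{Y^n|S_2^n} \sim \mathcal{N}(S_2^n + X^n, \frac{1}{\gamma} I)$, but now, $Q_{Y^n|S_2^n} \sim \mathcal{N}((1+c)S_2^n, (\frac{1}{\gamma} + rP) I)$. The (conditional) divergence is now given by

$$D(P_{Y^n|S_2^n} \| Q_{Y^n|S_2^n}) = \frac{n}{2}\left[\log(1 + \gamma rP) + \frac{1}{1 + \gamma rP} - 1\right] + \frac{\gamma}{2(1 + \gamma rP)}\,\|X^n - cS_2^n\|^2.$$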
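As a sanity check on the divergence between the two conditional distributions stated above, the closed form for the Kullback-Leibler divergence between two isotropic Gaussians can be evaluated directly. The sketch below is illustrative only: the function names are ours, the divergence is in nats, and the sample values of $x^n$, $s_2^n$, $c$, $\gamma$, $r$ and $P$ are arbitrary.

```python
import math


def kl_isotropic_gaussians(mean_p, var_p, mean_q, var_q):
    """D(N(mean_p, var_p I) || N(mean_q, var_q I)) in nats."""
    n = len(mean_p)
    sq_dist = sum((a - b) ** 2 for a, b in zip(mean_p, mean_q))
    return (0.5 * n * (math.log(var_q / var_p) + var_p / var_q - 1.0)
            + sq_dist / (2.0 * var_q))


def conditional_divergence(x, s2, c, gamma, r, P):
    """Divergence between the true and mismatched laws of Y^n given S2^n = s2.

    True law:       N(s2 + x, (1/gamma) I)
    Mismatched law: N((1 + c) s2, (1/gamma + r P) I)
    """
    mean_p = [s + xi for s, xi in zip(s2, x)]
    mean_q = [(1.0 + c) * s for s in s2]
    return kl_isotropic_gaussians(mean_p, 1.0 / gamma, mean_q, 1.0 / gamma + r * P)


# Arbitrary illustrative values.
d = conditional_divergence(x=[1.0], s2=[0.0], c=0.0, gamma=1.0, r=1.0, P=1.0)
print(d)
```

The mean difference is $x^n - c s_2^n$, so the divergence vanishes when $x^n = c s_2^n$ and $r = 0$, which is the case where the mismatched model describes the helper exactly.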



Combining the divergence bound after taking expectation over $S_2^n$ with the $\mathrm{MSE}_Q$ bound after integration gives (see the corresponding step in the proof of Proposition 7)



l+joPi (1+71^/) (I + 70P/) P/(l + 7i^/) 
P2 ^271 ^ 1 , _ ^_^ 1 



P2 hl + log( ) 

P/(l+7o^/) 1 + 71^^ l + 7irP ^4 + 7irP^ 

+ ( I I 11 W2 

^P/(l+7oP/) P/(l+7i^/) l + lirP' 

1 1 C7i E<S'^',X"> 

^ ^P/(l+7oP/) " P/(l+7iP/) ^ l+7irP^ ^5^ 

> log(- 



■I+70P7' (1+71^/) (I + 70P/) P/(l + 7i^/) 

i^2 c27i 1 , , 1 



P/(l+7oP/) 1+71?'^ l+7rP "''l + 7rP' 

+ ( I I 11 W2 

^P/(l+7oP/) P/(l+7i^/) 1 + 71^^ 

1 1 C'7i / 

~ '^^P/(l + 7oP/) " P7(l + 7i^/) ^ 1 + 717-^^^ 
The final line follows from successive application of Cauchy-Schwarz on $\mathrm{E}\langle S_2^n, X^n\rangle$. Minimizing the
bound over Ixl < a/P then gives the generalized lower bound. Let a = (-n-rrr — st — d-ttt — dt — tt^^—d) and 
b = |2( p^(i^\^p^) - tmTT^TpT) + TT^)I Vi^- We note that 6 < and let f{x) = ax^ - b\x\. 

We note that if $a < 0$, $f(x)$ is symmetric and decreasing in $|x|$. Therefore, we set $x^* = \sqrt{P}$. If $a > 0$, then $x^* = b/(2a)$ if $b/(2a) < \sqrt{P}$ and $x^* = \sqrt{P}$ otherwise. The generalized lower bound is now given by



21^^MMS£;(7o)>log('+^^''^- ' ' "'^ 



I + 70P1 (1 + 71^/) (1+70^/) P/(l+7i^/) 



^2-— - + l+log(- 



P/(l+7oP/) l + 7irP 1 + 77-P ""l + 7rP 

+ ( \ I H )^*2 

^Pz(l + 7oP/) P/(l + 7i^/) l + 7ir-P^ 

' ^P/(l + 7o/'/) P/(l + 7i^/) 1+71^^'^ '"^ ' 
where we optimize over $\gamma_1 > \gamma_0$, $r > 0$ and $c \in \mathbb{R}$. Noting that $\mathrm{MMSE}(1)/n = D(P)_{\min}$ then completes the proof.
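The minimization of $f(x) = ax^2 - b|x|$ over $|x| \le \sqrt{P}$ used in this proof can be sketched as follows. This is only an illustration of the case analysis: the coefficients `a` and `b` are treated as given inputs with $b \ge 0$ (they are not computed from $\gamma_0$, $\gamma_1$, $r$ and $c$ here), and the function name `minimize_f` is hypothetical.

```python
import math


def minimize_f(a, b, P):
    """Minimize f(x) = a*x**2 - b*|x| over |x| <= sqrt(P), assuming b >= 0.

    Returns the minimizing value of |x| and the minimum of f.
    """
    x_max = math.sqrt(P)
    if a > 0 and b / (2 * a) < x_max:
        # Interior stationary point of the convex branch.
        x_star = b / (2 * a)
    else:
        # f is non-increasing in |x| on [0, sqrt(P)], so the boundary is optimal.
        x_star = x_max
    return x_star, a * x_star**2 - b * x_star


print(minimize_f(1.0, 2.0, 4.0))    # interior minimizer
print(minimize_f(1.0, 10.0, 4.0))   # stationary point outside the constraint
print(minimize_f(-1.0, 1.0, 4.0))   # concave case, boundary minimizer
```

Because $f$ depends on $x$ only through $|x|$, it suffices to search over $[0, \sqrt{P}]$, which is why a single non-negative minimizer is returned.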



Appendix V 
Sketch of Theorem 9

The achievability scheme in Theorem 9 closely resembles that of [12]. It involves allocating a fraction of the power to transmitting a message (corresponding to a compressed version of the desired source $S_1$) using dirty paper coding, and using the remaining power for uncoded transmission of a linear combination of $S_1$ and $S_2$. The compressed index is generated based on Wyner-Ziv coding and then transmitted reliably over the channel using dirty paper coding as in [12]. The bin indices in Wyner-Ziv coding are transmitted at a rate equal to the capacity of the dirty paper channel. Note that the interference in this channel also includes the signal due to uncoded transmission created at the encoder. The compressed index is decoded at the receiver using the receiver side information $Y$, and both the decoded codeword and $Y$ are used to estimate the source $S_1$. Uncoded transmission helps in improving the signal to noise ratio of the desired signal $S_1$ in $Y$.

Let 



U = X' 




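$$\begin{aligned}
X &= \alpha\sqrt{\frac{P}{P_1}}\, S_1 + \beta\sqrt{\frac{P}{P_2}}\, S_2 + X', \\
X' &\sim \mathcal{N}(0, P(1 - \alpha^2 - \beta^2)), \\
Y &= X + S_1 + S_2 + Z,
\end{aligned}$$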

where $X'$ is independent of $S_1$ and $S_2$ and corresponds to the coded part of the signal. Auxiliary $U$ is used to cancel the total interference to $X'$ as in dirty paper coding. The total interference is equal to $\left(\alpha\sqrt{\frac{P}{P_1}} + 1\right) S_1 + \left(\beta\sqrt{\frac{P}{P_2}} + 1\right) S_2$. As a result, a clean channel (without interference) is created between $X'$ and $Y$, which can be used to transmit the description of $S_1$ at a Wyner-Ziv rate equal to $\frac{1}{2}\log\left(1 + \frac{P(1-\alpha^2-\beta^2)}{N}\right)$. The received signal $Y$ can also be seen as a noisy version of the desired signal $S_1$, and is used along with the message transmitted to reconstruct $S_1$.

Therefore, the resulting distortion in $S_1$ is given by



D = 



Pi 



-1 Pi 



(/3y/^+l) P2+P(l-a^-P)+N 



N 